Elucidating the druggability of the human proteome with eFindSite

J Comput Aided Mol Des. 2019 May;33(5):509-519. doi: 10.1007/s10822-019-00197-w. Epub 2019 Mar 19.

Abstract

Assessing the viability of protein targets is one of the preliminary steps of drug discovery. Determining druggability, the ability of a protein to bind drugs that modulate its function, requires a non-trivial amount of time and resources. The inability to properly measure druggability has accounted for a significant portion of failures in drug discovery. This problem is further exacerbated by the large number of proteins involved in human diseases. Because of these barriers, the druggability space within the human proteome remains unexplored, which has made it difficult to develop drugs for numerous diseases. Hence, we present a new feature developed in eFindSite that employs supervised machine learning to predict the druggability of a given protein. Benchmarking calculations against the Non-Redundant data set of Druggable and Less Druggable binding sites demonstrate that the AUC for druggability prediction with eFindSite is as high as 0.88. With eFindSite, we estimated the human druggability space to comprise 10,191 proteins. Considering the disease space from the Open Targets Platform and excluding already known targets from the predicted data set reveals 2,731 potentially novel therapeutic targets. eFindSite is freely available as stand-alone software at https://github.com/michal-brylinski/efindsite.
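
For readers unfamiliar with the AUC metric quoted above, the following is a minimal sketch of how an area under the ROC curve can be computed for a binary druggable / less-druggable classifier. It is not taken from eFindSite: the function name `roc_auc` and the toy labels and scores are hypothetical, chosen only to illustrate the definition (the probability that a randomly chosen druggable site is scored above a randomly chosen less-druggable one, with ties counted as one half).

```python
def roc_auc(labels, scores):
    """Return the ROC AUC via pairwise comparison of positives and negatives.

    labels: iterable of 1 (druggable) or 0 (less druggable)
    scores: iterable of predicted druggability scores, higher = more druggable
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not results from the paper.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation. The pairwise form is quadratic in the number of sites, so a rank-based implementation would be preferable for large benchmarks.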

Keywords: Drug targets; Druggability prediction; Human proteome; Molecular modeling; Pocket prediction; Structural bioinformatics; eFindSite.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • 5-Aminolevulinate Synthetase / chemistry
  • 5-Aminolevulinate Synthetase / metabolism
  • Binding Sites
  • Drug Design
  • Drug Discovery / methods*
  • Humans
  • Protein Binding
  • Proteins / chemistry
  • Proteins / metabolism*
  • Proteome / chemistry
  • Proteome / metabolism
  • Serine Proteases / chemistry
  • Serine Proteases / metabolism
  • Software
  • Supervised Machine Learning*

Substances

  • Proteins
  • Proteome
  • 5-Aminolevulinate Synthetase
  • ALAS2 protein, human
  • ABHD11 protein, human
  • Serine Proteases