SiteFerret: Beyond Simple Pocket Identification in Proteins

J Chem Theory Comput. 2023 Aug 8;19(15):5242-5259. doi: 10.1021/acs.jctc.2c01306. Epub 2023 Jul 20.

Abstract

We present a novel method for the automatic detection of pockets on protein molecular surfaces. The algorithm is based on an ad hoc hierarchical clustering of virtual probe spheres obtained from the geometrical primitives used by the NanoShaper software to build the solvent-excluded molecular surface. The final ranking of putative pockets is based on the Isolation Forest method, an unsupervised learning approach originally developed for anomaly detection. A detailed importance analysis of pocket features provides insight into which geometrical (clustering) and chemical (amino acid composition) properties characterize a good binding site. The method also provides a segmentation of pockets into smaller subpockets. We show that subpockets are a convenient representation to pinpoint the binding site with great precision. SiteFerret is outstanding in its versatility, accurately predicting a wide range of binding sites, from those binding small molecules to those binding peptides, including difficult shallow sites.
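
To illustrate the ranking idea, the following is a minimal sketch of an Isolation Forest in plain Python. It is not SiteFerret's code. The function names (`fit_forest`, `anomaly_score`, `rank_candidates`), the two-dimensional feature vectors, and the decision to fit the forest on reference pockets and rank candidates by ascending anomaly score are all assumptions made for illustration; the abstract does not specify the features, the training set, or the direction of the ranking.

```python
import math
import random

EULER_GAMMA = 0.5772156649


def average_path_length(n):
    """Expected path length of an unsuccessful binary-search-tree lookup over n points."""
    if n < 2:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(rows, depth, max_depth, rng):
    """Recursively partition rows with random axis-aligned splits."""
    if len(rows) <= 1 or depth >= max_depth:
        return ("leaf", len(rows))
    # Only features that still vary inside this node can be split.
    dims = [d for d in range(len(rows[0]))
            if min(r[d] for r in rows) < max(r[d] for r in rows)]
    if not dims:
        return ("leaf", len(rows))
    dim = rng.choice(dims)
    split = rng.uniform(min(r[dim] for r in rows), max(r[dim] for r in rows))
    left = [r for r in rows if r[dim] < split]
    right = [r for r in rows if r[dim] >= split]
    return ("node", dim, split,
            build_tree(left, depth + 1, max_depth, rng),
            build_tree(right, depth + 1, max_depth, rng))


def path_length(tree, x, depth=0):
    """Depth at which x lands in a leaf, plus a correction for unsplit points."""
    if tree[0] == "leaf":
        return depth + average_path_length(tree[1])
    _, dim, split, left, right = tree
    return path_length(left if x[dim] < split else right, x, depth + 1)


def fit_forest(rows, n_trees=100, sample_size=256, seed=0):
    """Build n_trees isolation trees, each on a random subsample of rows."""
    rows = list(rows)
    if len(rows) < 2:
        raise ValueError("need at least two reference vectors")
    rng = random.Random(seed)
    psi = min(sample_size, len(rows))
    max_depth = math.ceil(math.log2(psi))
    trees = [build_tree(rng.sample(rows, psi), 0, max_depth, rng)
             for _ in range(n_trees)]
    return trees, average_path_length(psi)


def anomaly_score(forest, x):
    """Score in (0, 1); values closer to 1 mean x is isolated in fewer splits."""
    trees, norm = forest
    mean_len = sum(path_length(t, x) for t in trees) / len(trees)
    return 2.0 ** (-mean_len / norm)


def rank_candidates(forest, candidates):
    """Assumed ranking rule: least anomalous candidate first."""
    scored = [(name, anomaly_score(forest, vec)) for name, vec in candidates.items()]
    return sorted(scored, key=lambda item: item[1])


# Hypothetical data: a grid of reference feature vectors and two candidates.
reference = [(i / 5, j / 5) for i in range(6) for j in range(6)]
forest = fit_forest(reference, seed=0)
candidates = {"typical": (0.5, 0.5), "atypical": (10.0, 10.0)}
ranked = rank_candidates(forest, candidates)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

In this toy setting the candidate lying far from the reference vectors is isolated in fewer random splits and therefore receives the higher anomaly score. In practice a maintained implementation such as scikit-learn's `IsolationForest` would be a more natural choice than hand-written trees.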

MeSH terms

  • Algorithms
  • Binding Sites
  • Ligands
  • Peptides / metabolism
  • Protein Binding
  • Protein Conformation
  • Proteins* / chemistry
  • Software*

Substances

  • Proteins
  • Peptides
  • Ligands