Automated large-scale file preparation, docking, and scoring: evaluation of ITScore and STScore using the 2012 Community Structure-Activity Resource benchmark

J Chem Inf Model. 2013 Aug 26;53(8):1905-14. doi: 10.1021/ci400045v. Epub 2013 May 21.

Abstract

In this study, we use the recently released 2012 Community Structure-Activity Resource (CSAR) data set to evaluate two knowledge-based scoring functions, ITScore and STScore, and a simple force-field-based potential (VDWScore). The CSAR data set contains 757 compounds, most with known affinities, and 57 crystal structures. Using script files for docking preparation, we evaluate the scoring functions on the full CSAR data set for binding affinity prediction and active/inactive compound discrimination. We also use the CSAR subset that includes crystal structures to evaluate the scoring functions on binding mode and affinity predictions. Within this structure subset, we investigate the importance of accurate ligand and protein conformational sampling and find that binding affinity predictions are less sensitive to non-native ligand and protein conformations than binding mode predictions. We also find that binding mode prediction is more challenging for the full CSAR data set than for the subset with structures. The script files used to prepare the CSAR data set for docking, including scripts for canonicalization of the ligand atoms, are freely available to the academic community.
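
The three evaluation tasks named in the abstract (binding affinity prediction, active/inactive discrimination, and binding mode prediction) can be illustrated with a minimal sketch. The abstract does not state which metrics were used, so everything below is an assumption based on common docking-benchmark practice: Pearson correlation for affinity, a pairwise ROC AUC for discrimination, and a heavy-atom RMSD with a 2.0 Å cutoff for binding mode success. The function names and the convention that a lower score means stronger predicted binding are hypothetical and are not taken from ITScore, STScore, or the authors' scripts.

```python
import math


def pearson_r(scores, affinities):
    """Pearson correlation between predicted scores and measured affinities."""
    n = len(scores)
    if n != len(affinities) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_s = sum(scores) / n
    mean_a = sum(affinities) / n
    cov = sum((s - mean_s) * (a - mean_a) for s, a in zip(scores, affinities))
    var_s = sum((s - mean_s) ** 2 for s in scores)
    var_a = sum((a - mean_a) ** 2 for a in affinities)
    if var_s == 0 or var_a == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_s * var_a)


def auc_lower_is_better(active_scores, inactive_scores):
    """ROC AUC as the fraction of (active, inactive) pairs ranked correctly.

    Assumption: a lower score means stronger predicted binding.
    Ties count as half a correct pair.
    """
    if not active_scores or not inactive_scores:
        raise ValueError("need at least one active and one inactive score")
    wins = 0.0
    for a in active_scores:
        for i in inactive_scores:
            if a < i:
                wins += 1.0
            elif a == i:
                wins += 0.5
    return wins / (len(active_scores) * len(inactive_scores))


def rmsd(coords_a, coords_b):
    """RMSD between two poses given as lists of (x, y, z) tuples.

    Assumption: both lists use the same atom order, which is what a
    ligand-atom canonicalization step would be expected to provide.
    No superposition is performed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("poses must be non-empty and have the same atom count")
    sq = sum((a - b) ** 2
             for p, q in zip(coords_a, coords_b)
             for a, b in zip(p, q))
    return math.sqrt(sq / len(coords_a))


def success_rate(rmsds, cutoff=2.0):
    """Fraction of predicted poses within `cutoff` angstroms of the native pose."""
    if not rmsds:
        raise ValueError("need at least one RMSD value")
    return sum(1 for r in rmsds if r <= cutoff) / len(rmsds)
```

For example, `success_rate` would be applied to the per-complex RMSD values of the top-ranked poses in the structure subset, while `pearson_r` and `auc_lower_is_better` would be applied to scores for the full compound set. Note that if affinities are expressed as pKd or pKi, a good lower-is-better scoring function would be expected to give a negative Pearson correlation.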

Publication types

  • Evaluation Study
  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Automation
  • Crystallography, X-Ray
  • Databases, Pharmaceutical*
  • Electronic Data Processing*
  • Molecular Docking Simulation / methods*
  • Protein Conformation
  • Structure-Activity Relationship