A Physics-Guided Neural Network for Predicting Protein-Ligand Binding Free Energy: From Host-Guest Systems to the PDBbind Database

Biomolecules. 2022 Jun 29;12(7):919. doi: 10.3390/biom12070919.

Abstract

Calculation of protein-ligand binding affinity is a cornerstone of drug discovery. Classic implicit solvent models, which have been widely used for this task, lack accuracy compared to experimental references. Emerging data-driven models, on the other hand, are often accurate yet not fully interpretable and are prone to overfitting. In this research, we explore the application of Theory-Guided Data Science to the study of protein-ligand binding. A hybrid model is introduced by integrating a Graph Convolutional Network (data-driven model) with the GBNSR6 implicit solvent model (physics-based model). The proposed physics-data model is tested on a dataset of 368 complexes from the PDBbind refined set and 72 host-guest systems. Results demonstrate that the proposed Physics-Guided Neural Network improves the "accuracy" of the purely data-driven model. In addition, the "interpretability" and "transferability" of our model are improved compared to the purely data-driven model. Further analyses include evaluating model robustness and understanding relationships between the physical features.
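
The abstract does not specify how the graph network and the physics-based terms are combined. As an illustration only, the sketch below shows one common way such a hybrid could be wired: a single graph-convolution layer using the standard symmetric normalization of the adjacency matrix with self-loops, mean pooling over atoms, and a linear readout over the pooled embedding concatenated with a vector of physics-derived features. All function names, the weights, and the `physics_feats` values are hypothetical placeholders. They are not the authors' architecture and not the output of GBNSR6.

```python
import math


def normalize_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2 for a square adjacency matrix."""
    n = len(adj)
    # Add self-loops so each atom keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def gcn_layer(a_norm, h, w):
    """One graph convolution: ReLU(A_norm @ H @ W)."""
    z = matmul(matmul(a_norm, h), w)
    return [[max(0.0, v) for v in row] for row in z]


def predict(adj, node_feats, physics_feats, w_gcn, w_out, bias):
    """Mean-pool node embeddings, append physics features, apply a linear readout."""
    h = gcn_layer(normalize_adjacency(adj), node_feats, w_gcn)
    pooled = [sum(col) / len(h) for col in zip(*h)]
    joint = pooled + list(physics_feats)  # hybrid feature vector
    return sum(x * w for x, w in zip(joint, w_out)) + bias


# Toy two-atom "molecule" with one feature per atom (all values hypothetical).
adj = [[0.0, 1.0], [1.0, 0.0]]
node_feats = [[1.0], [3.0]]
physics_feats = [-4.0]  # stand-in for an implicit-solvent energy term
score = predict(adj, node_feats, physics_feats,
                w_gcn=[[1.0]], w_out=[0.5, 1.0], bias=0.25)
print(score)
```

In a real setting the weights would be learned from binding data and the physics features would come from an implicit solvent calculation. Keeping those features as explicit inputs to the readout is one way a model of this kind can remain partly interpretable.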

Keywords: binding free energy; graph convolutional network; implicit solvent model.

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Ligands
  • Neural Networks, Computer*
  • Physics
  • Protein Binding
  • Proteins* / chemistry
  • Solvents / chemistry
  • Thermodynamics

Substances

  • Ligands
  • Proteins
  • Solvents

Grants and funding

This research was partially funded by the National Science Foundation (NSF) Grant No. 2136095 to N.F.