End-to-end protein-ligand complex structure generation with diffusion-based generative models

BMC Bioinformatics. 2023 Jun 5;24(1):233. doi: 10.1186/s12859-023-05354-5.

Abstract

Background: Three-dimensional structures of protein-ligand complexes provide valuable insights into their interactions and are crucial for molecular biological studies and drug design. However, their high-dimensional and multimodal nature hinders end-to-end modeling, and earlier approaches depend inherently on existing protein structures. To overcome these limitations and expand the range of complexes that can be accurately modeled, it is necessary to develop efficient end-to-end methods.

Results: We introduce an equivariant diffusion-based generative model that learns the joint distribution of ligand and protein conformations conditioned on the molecular graph of a ligand and the sequence representation of a protein extracted from a pre-trained protein language model. Benchmark results show that this protein structure-free model is capable of generating diverse structures of protein-ligand complexes, including those with correct binding poses. Further analyses indicate that the proposed end-to-end approach is particularly effective when the ligand-bound protein structure is not available.
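
The abstract does not specify the noise schedule, network architecture, or parameterization, so the following is only a minimal sketch of the general idea behind an equivariant diffusion process on joint ligand and protein coordinates, not the authors' implementation. It assumes a generic DDPM-style forward noising step applied to zero-centered 3D coordinates, and it checks numerically that this step commutes with rotations. All function names (`center`, `noise_with`, `random_rotation`) and the toy 12-atom point cloud are hypothetical.

```python
import numpy as np

def center(x):
    """Subtract the centroid so the point cloud has zero mean (removes translation)."""
    return x - x.mean(axis=0, keepdims=True)

def noise_with(x0, eps, alpha_bar_t):
    """Generic DDPM-style forward step (an assumption, not the paper's schedule):
    x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps,
    with both x0 and eps projected onto the zero-centroid subspace."""
    return np.sqrt(alpha_bar_t) * center(x0) + np.sqrt(1.0 - alpha_bar_t) * center(eps)

def random_rotation(rng):
    """Draw a random 3x3 rotation matrix (orthogonal, determinant +1) via QR."""
    q, r = np.linalg.qr(rng.standard_normal((3, 3)))
    q = q * np.sign(np.diag(r))   # fix column signs so the factorization is unique
    if np.linalg.det(q) < 0:      # flip one axis to turn a reflection into a rotation
        q[:, 0] = -q[:, 0]
    return q

rng = np.random.default_rng(0)
x0 = rng.standard_normal((12, 3)) + 5.0   # toy "complex": 12 atoms, off-center
eps = rng.standard_normal(x0.shape)       # Gaussian noise, one 3D vector per atom
R = random_rotation(rng)

x_t = noise_with(x0, eps, alpha_bar_t=0.5)
x_t_from_rotated = noise_with(x0 @ R.T, eps @ R.T, alpha_bar_t=0.5)

# Equivariance check: noising rotated inputs should equal rotating the noised output.
print(np.allclose(x_t_from_rotated, x_t @ R.T))
```

Because centering and the noising step are both linear in the coordinates, rotating the inputs and rotating the output should agree up to floating-point tolerance. A denoising network with the same symmetry would then yield a generative distribution that does not depend on the global orientation of the complex.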

Conclusion: The present results demonstrate the effectiveness and generative capability of our end-to-end complex structure modeling framework with diffusion-based generative models. We anticipate that this framework will lead to better modeling of protein-ligand complexes, and we expect further improvements and wide applications.

Keywords: Deep generative model; Molecular interaction; Protein structure prediction; Protein–ligand complex.

MeSH terms

  • Drug Design*
  • Ligands
  • Protein Conformation
  • Proteins* / chemistry

Substances

  • Ligands
  • Proteins