Are Deep Learning Structural Models Sufficiently Accurate for Virtual Screening? Application of Docking Algorithms to AlphaFold2 Predicted Structures

Anna M Díaz-Rovira; Helena Martín; Thijs Beuming; Lucía Díaz; Victor Guallar; Soumya S Ray

doi:10.1021/acs.jcim.2c01270

J Chem Inf Model. 2023 Mar 27;63(6):1668-1674. doi: 10.1021/acs.jcim.2c01270. Epub 2023 Mar 9.

Authors

Anna M Díaz-Rovira¹, Helena Martín², Thijs Beuming³, Lucía Díaz², Victor Guallar¹,²,⁴, Soumya S Ray⁵,⁶

Affiliations

¹ Barcelona Supercomputing Center, Jordi Girona 29, E-08034 Barcelona, Spain.
² Nostrum Biodiscovery S.L., E-08029 Barcelona, Spain.
³ Latham Biopharm Group, 101 Main Street, Suite 1400, Cambridge, Massachusetts 02142, United States.
⁴ ICREA, Passeig Lluís Companys 23, E-08010 Barcelona, Spain.
⁵ RA Capital, 200 Berkeley Street, Boston, Massachusetts 02116, United States.
⁶ 3-Dimensional Consulting, 134 Franklin Avenue, Quincy, Massachusetts 02170, United States.

PMID: 36892986
DOI: 10.1021/acs.jcim.2c01270

Abstract

Machine learning-based protein structure prediction algorithms, such as RoseTTAFold and AlphaFold2, have greatly impacted the structural biology field, arousing a fair amount of discussion around their potential role in drug discovery. While a few preliminary studies address the use of these models in virtual screening, none of them focuses on the prospect of hit-finding in a real-world virtual screen with a model based on low prior structural information. To address this, we have developed an AlphaFold2 version in which we exclude all structural templates with more than 30% sequence identity from the model-building process. In a previous study, we used those models in conjunction with state-of-the-art free energy perturbation methods and demonstrated that it is possible to obtain quantitatively accurate results. In this work, we focus on using these structures in rigid receptor-ligand docking studies. Our results indicate that using out-of-the-box AlphaFold2 models is not an ideal scenario for virtual screening campaigns; in fact, we strongly recommend including some post-processing modeling to drive the binding site into a more realistic holo model.
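
The template-exclusion criterion mentioned in the abstract (dropping templates with more than 30% sequence identity) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the input layout (pre-aligned, gapped sequence pairs), and the definition of identity (matches divided by columns where both sequences have a residue) are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def percent_identity(aligned_query: str, aligned_template: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Assumption: inputs are already aligned and use '-' for gaps.
    """
    if len(aligned_query) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where both sequences contribute a residue.
    pairs = [(q, t) for q, t in zip(aligned_query, aligned_template)
             if q != "-" and t != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for q, t in pairs if q == t)
    return 100.0 * matches / len(pairs)


def filter_templates(
    alignments: Dict[str, Tuple[str, str]],
    max_identity: float = 30.0,
) -> List[str]:
    """Return template names whose identity to the query is at most max_identity.

    `alignments` maps a (hypothetical) template name to a pair
    (aligned_query, aligned_template).
    """
    return [
        name
        for name, (query, template) in alignments.items()
        if percent_identity(query, template) <= max_identity
    ]


if __name__ == "__main__":
    # Toy sequences, not real proteins.
    example = {
        "template_close": ("ACDEFGHIKL", "ACDEFGHIKL"),
        "template_remote": ("ACDEFGHIKL", "AXXXXXXXXL"),
    }
    print(filter_templates(example))
```

In this toy input only `template_remote` would be retained. Real pipelines may compute identity differently, for example relative to the full query length, so the threshold is only meaningful once that definition is fixed.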

Publication types

Research Support, N.I.H., Intramural

MeSH terms

Algorithms
Deep Learning*
Ligands
Molecular Docking Simulation
Protein Binding
Protein Conformation
Proteins / chemistry

Substances

Ligands
Proteins