Supervised machine learning techniques to predict binding affinity. A study for cyclin-dependent kinase 2

Maurício Boff de Ávila; Mariana Morrone Xavier; Val Oliveira Pintro; Walter Filgueira de Azevedo Jr

doi:10.1016/j.bbrc.2017.10.035

Biochem Biophys Res Commun. 2017 Dec 9;494(1-2):305-310. doi: 10.1016/j.bbrc.2017.10.035. Epub 2017 Oct 7.

Authors

Maurício Boff de Ávila¹, Mariana Morrone Xavier², Val Oliveira Pintro², Walter Filgueira de Azevedo Jr³

Affiliations

¹ Laboratory of Computational Systems Biology, School of Sciences - Pontifical Catholic University of Rio Grande do Sul (PUCRS), Av. Ipiranga, 6681, Porto Alegre, RS 90619-900, Brazil; Graduate Program in Cellular and Molecular Biology, Pontifical Catholic University of Rio Grande do Sul (PUCRS), Av. Ipiranga, 6681, Porto Alegre, RS 90619-900, Brazil.
² Laboratory of Computational Systems Biology, School of Sciences - Pontifical Catholic University of Rio Grande do Sul (PUCRS), Av. Ipiranga, 6681, Porto Alegre, RS 90619-900, Brazil.
³ Laboratory of Computational Systems Biology, School of Sciences - Pontifical Catholic University of Rio Grande do Sul (PUCRS), Av. Ipiranga, 6681, Porto Alegre, RS 90619-900, Brazil; Graduate Program in Cellular and Molecular Biology, Pontifical Catholic University of Rio Grande do Sul (PUCRS), Av. Ipiranga, 6681, Porto Alegre, RS 90619-900, Brazil. Electronic address: walter@azevedolab.net.

PMID: 29017921
DOI: 10.1016/j.bbrc.2017.10.035

Abstract

Here we report the development of a machine-learning model to predict binding affinity based on the crystallographic structures of protein-ligand complexes. We used an ensemble of crystallographic structures (resolution better than 1.5 Å) for which half-maximal inhibitory concentration (IC₅₀) data are available. Polynomial scoring functions were built using the energy terms present in the MolDock and PLANTS scoring functions as explanatory variables. When prediction performance was tested, the supervised machine learning models showed improved predictive power compared with the PLANTS and MolDock scoring functions. In addition, the machine-learning model was applied to predict binding affinity for CDK2, where it performed better than the AutoDock4, AutoDock Vina, MolDock, and PLANTS scores.
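As a rough illustration of what a polynomial scoring function built on energy terms might look like, the following minimal sketch fits a degree-2 polynomial by ordinary least squares. It is not the authors' implementation: the two energy terms, the synthetic target values, the small ridge constant, and all function names (`polynomial_features`, `solve`, `fit`, `predict`) are assumptions made only for this example, and no real MolDock or PLANTS output is used.

```python
from itertools import combinations_with_replacement


def polynomial_features(terms, degree=2):
    """Expand energy terms into [1, x_i, x_i*x_j, ...] up to `degree`."""
    feats = [1.0]
    for d in range(1, degree + 1):
        for combo in combinations_with_replacement(range(len(terms)), d):
            prod = 1.0
            for i in combo:
                prod *= terms[i]
            feats.append(prod)
    return feats


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit(X, y, ridge=1e-8):
    """Least-squares fit via the normal equations.

    The tiny ridge term is an assumption added for numerical stability.
    """
    rows = [polynomial_features(x) for x in X]
    k = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) + (ridge if i == j else 0.0)
            for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve(ata, aty)


def predict(weights, x):
    """Score one complex from its energy terms."""
    return sum(w * f for w, f in zip(weights, polynomial_features(x)))


# Synthetic data (assumption): two hypothetical energy terms per complex and
# a made-up affinity-like target generated from a known polynomial.
X = [(a, b) for a in (-3.0, -2.0, -1.0, 0.0, 1.0)
     for b in (-2.0, -1.0, 0.0, 1.0, 2.0)]
y = [6.0 - 0.5 * a + 0.1 * b * b for a, b in X]

weights = fit(X, y)
score = predict(weights, (-1.5, 0.5))
```

In practice the explanatory variables would be the per-complex energy terms reported by a docking program, the target would be an experimental affinity measure, and performance would be judged on structures held out from training rather than on the data used for fitting.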

Keywords: Bioinformatics; CDK2; Docking; Drug design; Kinase; Machine learning.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Antineoplastic Agents / chemistry*
Cyclin-Dependent Kinase 2 / antagonists & inhibitors*
Cyclin-Dependent Kinase 2 / chemistry
Databases, Protein
Datasets as Topic
Drug Design
Humans
Inhibitory Concentration 50
Ligands
Molecular Docking Simulation
Protein Kinase Inhibitors / chemistry*
ROC Curve
Supervised Machine Learning*
Thermodynamics

Substances

Antineoplastic Agents
Ligands
Protein Kinase Inhibitors
CDK2 protein, human
Cyclin-Dependent Kinase 2