Bioactivity predictions and virtual screening using machine learning predictive model

J Biomol Struct Dyn. 2024 Jan 12:1-20. doi: 10.1080/07391102.2023.2300132. Online ahead of print.

Abstract

Machine learning algorithms have recently attracted significant attention for predictive modeling. Prediction models for enzyme inhibitors remain limited, and chemical bias must be accounted for when developing them. The poor reproducibility of available models, together with chemical bias, constrains drug discovery and development. A new prediction model for enzyme inhibitors was developed, and its efficacy was evaluated using dipeptidyl peptidase-4 (DPP-4) inhibitors. A Python script was prepared and can be provided for personal use upon request. Among the machine learning algorithms tested, Random Forest offered the best accuracy. Two models were compared: one with structurally diverse training and test data and the other with a random split. It was concluded that machine learning predictive models based on the Murcko scaffold can address chemical bias concerns. In silico screening of the DrugBank database identified two molecules against DPP-4, both of which are previously proven hit molecules. The approach was further validated through molecular docking studies and molecular dynamics simulations, demonstrating the credibility and relevance of the developed model for future investigations and potential translation into clinical applications.

Communicated by Ramaswamy H. Sarma.

Keywords: DPP-4 inhibitors; Machine learning predictive model; MM-GBSA; molecular docking; molecular dynamics simulation.
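
The abstract contrasts a random split with a split in which training and test data are structurally diverse, and attributes the handling of chemical bias to the Murcko scaffold. The following is a minimal sketch of one common way to build such a scaffold-grouped split; it is not the authors' script, which is available only on request. The function name `scaffold_split`, the compound identifiers, and the scaffold strings are all hypothetical, and the scaffolds are assumed to have been computed beforehand (for example with RDKit's `MurckoScaffold.MurckoScaffoldSmiles`). Only the Python standard library is used.

```python
from collections import defaultdict


def scaffold_split(records, test_fraction=0.2):
    """Split compounds so that no scaffold appears in both train and test.

    records: list of (compound_id, scaffold) pairs, where `scaffold` is a
    precomputed Murcko scaffold string (assumed input, not computed here).
    Returns (train_ids, test_ids).
    """
    # Group compound identifiers by their scaffold.
    groups = defaultdict(list)
    for compound_id, scaffold in records:
        groups[scaffold].append(compound_id)

    # Largest scaffold groups first; ties broken by scaffold string so the
    # result is deterministic.
    ordered = sorted(groups.items(), key=lambda kv: (-len(kv[1]), kv[0]))

    # Maximum number of compounds allowed in the training set.
    max_train = len(records) - int(round(test_fraction * len(records)))

    train_ids, test_ids = [], []
    for _scaffold, ids in ordered:
        # A whole scaffold group goes to one side only, never both.
        if len(train_ids) + len(ids) <= max_train:
            train_ids.extend(ids)
        else:
            test_ids.extend(ids)
    return train_ids, test_ids


# Hypothetical toy data: ten compounds spread over six scaffolds.
records = [
    ("cpd1", "c1ccccc1"), ("cpd2", "c1ccccc1"), ("cpd3", "c1ccccc1"),
    ("cpd4", "c1ccncc1"), ("cpd5", "c1ccncc1"),
    ("cpd6", "C1CCNCC1"), ("cpd7", "C1CCNCC1"),
    ("cpd8", "c1ccc2ccccc2c1"), ("cpd9", "c1ccoc1"), ("cpd10", "C1CCOC1"),
]
scaffold_of = dict(records)
train_ids, test_ids = scaffold_split(records, test_fraction=0.2)
train_scaffolds = {scaffold_of[i] for i in train_ids}
test_scaffolds = {scaffold_of[i] for i in test_ids}
```

Because whole scaffold groups are assigned to one side, the test set contains only chemotypes the model has not seen during training, which is typically a harder and less biased evaluation than a random split. Any classifier, such as a Random Forest, could then be fitted on the training identifiers and scored on the test identifiers.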