BACPI: a bi-directional attention neural network for compound-protein interaction and binding affinity prediction

Bioinformatics. 2022 Mar 28;38(7):1995-2002. doi: 10.1093/bioinformatics/btac035.

Abstract

Motivation: The identification of compound-protein interactions (CPIs) is an essential step in the process of drug discovery. Experimental determination of CPIs is known to consume large amounts of funds and time. Computational models have therefore become a promising and efficient alternative for predicting novel interactions between compounds and proteins on a large scale. Most supervised machine learning models treat the task as a binary classification problem, aiming to predict whether or not the compound and the protein interact. However, a CPI is not a simple binary on-off relationship; it is better described by a continuous value, called binding affinity, that reflects how tightly the compound binds to a particular target protein.

Results: In this study, we propose an end-to-end neural network model, called BACPI, to predict CPI and binding affinity. We employ a graph attention network and a convolutional neural network (CNN) to learn the representations of compounds and proteins, and develop a bi-directional attention neural network model to integrate the representations. To evaluate the performance of BACPI, we use three CPI datasets and four binding affinity datasets in our experiments. The results show that, when predicting CPIs, BACPI significantly outperforms other available machine learning methods on both balanced and unbalanced datasets. This suggests that an end-to-end neural network model that predicts CPIs directly from low-level representations is more robust than traditional machine learning-based methods. When predicting binding affinities, BACPI achieves higher performance on large datasets than other state-of-the-art deep learning methods. This comparison suggests that the proposed method, with its bi-directional attention neural network, can capture the regions of compounds and proteins that are important for binding affinity prediction.
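
To make the idea of integrating two representations with bi-directional attention more concrete, the following is a minimal sketch of a generic co-attention step, not the BACPI implementation. It assumes per-atom compound vectors and per-residue protein vectors have already been produced by upstream encoders. The function names, the bilinear weight matrix, the tanh scoring, and the mean pooling of scores are all illustrative assumptions rather than details taken from the paper.

```python
import math
import random


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_sum(weights, vectors):
    """Combine equal-length vectors into one vector using the given weights."""
    dim = len(vectors[0])
    return [sum(w * v[j] for w, v in zip(weights, vectors)) for j in range(dim)]


def bidirectional_attention(atoms, residues, weight):
    """Hypothetical co-attention between atom and residue feature vectors.

    atoms:    list of n_atoms vectors of length d
    residues: list of n_res vectors of length d
    weight:   d x d bilinear matrix (assumed learnable in a real model)
    """
    d = len(weight)
    # Project every atom vector through the bilinear weight matrix.
    projected = [
        [sum(a[k] * weight[k][j] for k in range(d)) for j in range(d)]
        for a in atoms
    ]
    # Pairwise atom-residue compatibility scores, squashed by tanh.
    scores = [
        [math.tanh(sum(p[j] * r[j] for j in range(d))) for r in residues]
        for p in projected
    ]
    # Direction 1: how strongly each atom attends to the protein as a whole.
    atom_weights = softmax([sum(row) / len(row) for row in scores])
    # Direction 2: how strongly each residue attends to the compound as a whole.
    residue_weights = softmax([sum(col) / len(col) for col in zip(*scores)])
    compound_vec = weighted_sum(atom_weights, atoms)
    protein_vec = weighted_sum(residue_weights, residues)
    return compound_vec, protein_vec, atom_weights, residue_weights


# Toy inputs: 5 atoms and 12 residues with 8-dimensional random features.
random.seed(0)
d = 8
atoms = [[random.gauss(0, 1) for _ in range(d)] for _ in range(5)]
residues = [[random.gauss(0, 1) for _ in range(d)] for _ in range(12)]
weight = [[random.gauss(0, 0.1) for _ in range(d)] for _ in range(d)]

compound_vec, protein_vec, atom_w, res_w = bidirectional_attention(
    atoms, residues, weight
)
# A downstream predictor could take the concatenated vector as its input.
joint = compound_vec + protein_vec
```

In a sketch like this, the attention weights over atoms and residues are what one would inspect to see which regions of the compound and the protein contribute most to the joint representation.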

Availability and implementation: Data and source code are available at https://github.com/CSUBioGroup/BACPI.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Drug Discovery / methods
  • Machine Learning
  • Neural Networks, Computer*
  • Proteins / chemistry
  • Software*

Substances

  • Proteins