multi-type neighbors enhanced global topology and pairwise attribute learning for drug-protein interaction prediction

Brief Bioinform. 2022 Sep 20;23(5):bbac120. doi: 10.1093/bib/bbac120.

Abstract

Motivation: Accurate identification of proteins interacted with drugs helps reduce the time and cost of drug development. Most of previous methods focused on integrating multisource data about drugs and proteins for predicting drug-target interactions (DTIs). There are both similarity connection and interaction connection between two drugs, and these connections reflect their relationships from different perspectives. Similarly, two proteins have various connections from multiple perspectives. However, most of previous methods failed to deeply integrate these connections. In addition, multiple drug-protein heterogeneous networks can be constructed based on multiple kinds of connections. The diverse topological structures of these networks are still not exploited completely.

Results: We propose a novel model to extract and integrate multi-type neighbor topology information, diverse similarities and interactions related to drugs and proteins. Firstly, multiple drug-protein heterogeneous networks are constructed according to multiple kinds of connections among drugs and those among proteins. The multi-type neighbor node sequences of a drug node (or a protein node) are formed by random walks on each network and they reflect the hidden neighbor topological structure of the node. Secondly, a module based on graph neural network (GNN) is proposed to learn the multi-type neighbor topologies of each node. We propose attention mechanisms at neighbor node level and at neighbor type level to learn more informative neighbor nodes and neighbor types. A network-level attention is also designed to enhance the context dependency among multiple neighbor topologies of a pair of drug and protein nodes. Finally, the attribute embedding of the drug-protein pair is formulated by a proposed embedding strategy, and the embedding covers the similarities and interactions about the pair. A module based on three-dimensional convolutional neural networks (CNN) is constructed to deeply integrate pairwise attributes. Extensive experiments have been performed and the results indicate GCDTI outperforms several state-of-the-art prediction methods. The recall rate estimation over the top-ranked candidates and case studies on 5 drugs further demonstrate GCDTI's ability in discovering potential drug-protein interactions.

Keywords: attentions at neighbor-type level and at network level; drug–protein interaction; graph neural network with neighbor node-level attention; multi-type neighbors enhanced global topology; multiple drug–protein heterogenous networks; three-dimensional convolutional neural network.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Drug Development
  • Drug Interactions
  • Learning
  • Neural Networks, Computer*