Integrating multi-scale neighbouring topologies and cross-modal similarities for drug-protein interaction prediction

Brief Bioinform. 2021 Sep 2;22(5):bbab119. doi: 10.1093/bib/bbab119.

Abstract

Motivation: Identifying the proteins that interact with drugs can reduce the cost and time of drug development. Existing computational methods focus on integrating drug-related and protein-related data from multiple sources to predict candidate drug-target interactions (DTIs). However, multi-scale neighbouring node sequences and the various kinds of drug and protein similarities have not been fully explored, nor are they taken into account in decision making.

Results: We propose a drug-target interaction prediction method, DTIP, to encode and integrate multi-scale neighbouring topologies and multiple kinds of similarities, associations and interactions related to drugs and proteins. First, we construct a three-layer heterogeneous network to represent interactions and associations across drug, protein and disease nodes, and propose a learning framework based on a fully connected autoencoder to learn low-dimensional feature representations of the nodes within this network. Second, multi-scale neighbouring sequences of drug and protein nodes are formulated by random walks. A module based on a bidirectional gated recurrent unit is designed to learn the neighbouring sequential information and integrate the low-dimensional features of the nodes. Finally, we propose attention mechanisms at the feature level, the neighbouring topological level and the similarity level to learn more informative features, topologies and similarities. The prediction results are obtained by integrating neighbouring topologies, similarities and feature attributes using a multi-layer convolutional neural network (CNN). Comprehensive experimental results on a public dataset demonstrated the effectiveness of our innovative features and modules. Comparison with other state-of-the-art methods and case studies of five drugs further validated DTIP's ability to discover potential candidate drug-related proteins.
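
The random-walk step can be illustrated with a minimal sketch. It is not the DTIP implementation: the toy network, the node names (D* for drugs, P* for proteins, S* for diseases), the uniform transition rule, and the reading of "multi-scale" as walks of several lengths are all assumptions made for illustration, and the helper names build_adjacency, random_walk and multi_scale_sequences are hypothetical.

```python
import random

# Hypothetical toy heterogeneous network (not data from the paper).
# D* = drug nodes, P* = protein nodes, S* = disease nodes.
EDGES = [
    ("D1", "P1"), ("D1", "P2"), ("D2", "P2"),   # drug-protein interactions
    ("D1", "S1"), ("D2", "S1"), ("P1", "S1"),   # associations with a disease
    ("D1", "D2"),                               # drug-drug similarity edge
    ("P1", "P2"),                               # protein-protein similarity edge
]


def build_adjacency(edges):
    """Build an undirected adjacency list from a list of (u, v) edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    return adj


def random_walk(adj, start, length, rng):
    """Take `length` uniform random steps from `start`; return the visited nodes."""
    walk = [start]
    for _ in range(length):
        walk.append(rng.choice(adj[walk[-1]]))
    return walk


def multi_scale_sequences(adj, start, scales=(2, 4, 8), seed=0):
    """Assumed notion of 'multi-scale': one walk per length in `scales`."""
    rng = random.Random(seed)
    return {k: random_walk(adj, start, k, rng) for k in scales}


if __name__ == "__main__":
    adjacency = build_adjacency(EDGES)
    for scale, walk in multi_scale_sequences(adjacency, "D1").items():
        print(scale, walk)
```

In a pipeline of the kind the abstract describes, each node in such a sequence would be replaced by its learned low-dimensional feature vector before the sequence is passed to a recurrent module; that step is omitted here.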

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Computational Biology / methods*
  • Drug Development / methods
  • Humans
  • Machine Learning*
  • Models, Theoretical*
  • Pharmaceutical Preparations / chemistry
  • Pharmaceutical Preparations / metabolism*
  • Protein Binding
  • Proteins / chemistry
  • Proteins / metabolism*
  • Reproducibility of Results
  • Support Vector Machine

Substances

  • Pharmaceutical Preparations
  • Proteins