Drug-protein interaction prediction by correcting the effect of incomplete information in heterogeneous information

Bioinformatics. 2022 Nov 15;38(22):5073-5080. doi: 10.1093/bioinformatics/btac629.

Abstract

Motivation: Large-scale heterogeneous data provide diverse perspectives for predicting drug-protein interactions (DPIs). However, the available information on molecular interactions and clinical associations related to drugs or proteins is incomplete, because some interactions and associations have not yet been proven. In the available data, these unproven cases appear as non-interactions and non-associations, which may mislead the prediction model. Existing methods fuse incomplete and complete information without considering how complete each source is, so the negative effects of incomplete information persist.

Results: We develop a network-based DPI prediction method named BRWCP, which uses the complete information network to correct the prediction results obtained from the incomplete information network. First, the feature similarities of drugs and proteins are obtained by integrating relevant heterogeneous information that may be incomplete. Combining these feature similarities with known DPIs, an incomplete information-based drug-protein heterogeneous network is constructed. A bidirectional random walk with pruning algorithm is then run on this heterogeneous network to predict potential DPIs. Next, the predicted DPIs are combined with the chemical fingerprint similarity of drugs and the amino acid sequence similarity of proteins to construct the complete information network. The bidirectional random walk with pruning algorithm is run on this new network until it converges, yielding the final prediction results. Experimental results show that BRWCP is superior to several state-of-the-art DPI prediction methods, and case studies further confirm its ability to uncover potential DPIs.
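The two-stage procedure described above can be illustrated with a minimal sketch. It is not the authors' implementation (see the linked repository for that). It assumes a common bi-random-walk update, in which scores are averaged over a walk on the drug similarity network and a walk on the protein similarity network and then mixed with the seed matrix, and it assumes that "pruning" means zeroing scores below a threshold. The function names, the restart weight alpha, the threshold prune and the toy matrices are all hypothetical.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]

def row_normalize(m):
    """Scale each row to sum to 1 (all-zero rows stay zero)."""
    out = []
    for row in m:
        total = sum(row)
        out.append([v / total if total > 0 else 0.0 for v in row])
    return out

def matmul(a, b):
    cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def bi_random_walk(sim_drug, sim_prot, seed, alpha=0.5, prune=0.0,
                   tol=1e-6, max_iter=200):
    """Assumed update: R <- alpha * (Wd R + R Wp) / 2 + (1 - alpha) * seed,
    followed by an assumed pruning step that zeroes entries below `prune`."""
    walk_d = row_normalize(sim_drug)                        # row-stochastic
    walk_p = transpose(row_normalize(transpose(sim_prot)))  # column-stochastic
    scores = [row[:] for row in seed]
    for _ in range(max_iter):
        left = matmul(walk_d, scores)    # step on the drug network
        right = matmul(scores, walk_p)   # step on the protein network
        updated = []
        for l_row, r_row, s_row in zip(left, right, seed):
            new_row = []
            for l, r, s in zip(l_row, r_row, s_row):
                v = alpha * (l + r) / 2 + (1 - alpha) * s
                new_row.append(v if v >= prune else 0.0)  # assumed pruning rule
            updated.append(new_row)
        delta = max(abs(a - b) for ra, rb in zip(updated, scores)
                    for a, b in zip(ra, rb))
        scores = updated
        if delta < tol:  # stop once the scores no longer change
            break
    return scores

def two_stage_predict(feat_d, feat_p, fp_d, seq_p, known, **kw):
    # Stage 1: walk on the network built from possibly incomplete features.
    stage1 = bi_random_walk(feat_d, feat_p, known, **kw)
    # Stage 2: reuse stage-1 predictions as the seed on the network built from
    # fingerprint and sequence similarities.
    return bi_random_walk(fp_d, seq_p, stage1, **kw)

# Toy data (hypothetical): 3 drugs, 2 proteins.
known = [[1.0, 0.0], [0.0, 0.0], [0.0, 1.0]]
feat_d = [[1.0, 0.9, 0.0], [0.9, 1.0, 0.0], [0.0, 0.0, 1.0]]
feat_p = [[1.0, 0.2], [0.2, 1.0]]
fp_d = [[1.0, 0.8, 0.0], [0.8, 1.0, 0.0], [0.0, 0.0, 1.0]]
seq_p = [[1.0, 0.3], [0.3, 1.0]]

scores = two_stage_predict(feat_d, feat_p, fp_d, seq_p, known)
for row in scores:
    print([round(v, 3) for v in row])
```

In this toy example the second drug has no known interactions but is similar to the first drug, so it receives a higher score for the first protein than for the second. Raising prune discards weak candidate interactions at each iteration.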

Availability and implementation: The code and data used in BRWCP are available at https://github.com/lyfdomain/BRWCP.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Computational Biology* / methods
  • Drug Interactions
  • Proteins

Substances

  • Proteins