Drug-disease association prediction with literature based multi-feature fusion

Front Pharmacol. 2023 May 22:14:1205144. doi: 10.3389/fphar.2023.1205144. eCollection 2023.

Abstract

Introduction: Exploring the potential efficacy of a drug is a valid approach for drug development with shorter development times and lower costs. Recently, several computational drug repositioning methods have been introduced to learn multi-features for potential association prediction. However, fully leveraging the vast amount of information in the scientific literature to enhance drug-disease association prediction is a great challenge. Methods: We constructed a drug-disease association prediction method called Literature Based Multi-Feature Fusion (LBMFF), which effectively integrated known drugs, diseases, side effects and target associations from public databases as well as literature semantic features. Specifically, a pre-training and fine-tuning BERT model was introduced to extract literature semantic information for similarity assessment. Then, we revealed drug and disease embeddings from the constructed fusion similarity matrix by a graph convolutional network with an attention mechanism. Results: LBMFF achieved superior performance in drug-disease association prediction with an AUC value of 0.8818 and an AUPR value of 0.5916. Discussion: LBMFF achieved relative improvements of 31.67% and 16.09%, respectively, over the second-best results, compared to single feature methods and seven existing state-of-the-art prediction methods on the same test datasets. Meanwhile, case studies have verified that LBMFF can discover new associations to accelerate drug development. The proposed benchmark dataset and source code are available at: https://github.com/kang-hongyu/LBMFF.

Keywords: association prediction; disease; drug; drug repositioning; literature; multi-feature fusion.

Grants and funding

This work was supported by The National Social Science Fund of China (22CTQ024), Innovation Project of Chinese Academy of Medical Sciences (2021-I2M-1-001, 2021-I2M-1-056), The National Key Research and Development Program of China (2022YFB2702801), China Knowledge Center for Engineering Sciences and Technology (Medical Knowledge Service System) (CKCEST-2022-1-6).