MultiSourcDSim: an integrated approach for exploring disease similarity

BMC Med Inform Decis Mak. 2019 Dec 19;19(Suppl 6):269. doi: 10.1186/s12911-019-0968-8.

Abstract

Background: Collections of disease-associated data make it possible to study the associations between diseases. Discovering closely related diseases plays a crucial role in revealing their common pathogenic mechanisms, which may in turn suggest treatments that can be repurposed from one disease to another. Over the past decades, a number of approaches for calculating disease similarity have been developed. However, most of them draw on only one or a few data sources, which limits their accuracy.

Methods: In this paper, we propose a novel method, called MultiSourcDSim, to calculate disease similarity by integrating multiple data sources, namely gene-disease associations, GO biological process-disease associations and symptom-disease associations. First, we build three disease similarity networks, one from each of the three disease-related data sources. Second, we obtain a representation of each node by integrating the three small disease similarity networks. Finally, the learned representations are used to calculate the similarity between diseases.
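
The three steps above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the similarity measure or the integration procedure, so this sketch assumes Jaccard similarity for building each network, a random walk with restart to diffuse over each network, and simple concatenation of the diffusion states as the node representation (a simplified stand-in for the diffusion component analysis named in the keywords, without its dimensionality reduction step). All function names, parameter values, and the toy data are hypothetical.

```python
import math


def jaccard(a, b):
    """Jaccard index of two sets; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity_network(assoc, diseases):
    """Step 1 (assumed measure): disease-by-disease Jaccard similarity matrix."""
    return [[jaccard(assoc.get(d1, set()), assoc.get(d2, set()))
             for d2 in diseases] for d1 in diseases]


def random_walk_with_restart(network, restart=0.5, iters=50):
    """Return one diffusion-state vector per node of a weighted network."""
    n = len(network)
    trans = []
    for i, row in enumerate(network):
        total = sum(row)
        if total > 0:
            trans.append([v / total for v in row])  # row-normalise weights
        else:
            # Node with no edges at all: the walker stays where it is.
            trans.append([1.0 if j == i else 0.0 for j in range(n)])
    states = []
    for start in range(n):
        p = [1.0 if j == start else 0.0 for j in range(n)]
        for _ in range(iters):
            p = [(1 - restart) * sum(p[k] * trans[k][j] for k in range(n))
                 + (restart if j == start else 0.0) for j in range(n)]
        states.append(p)
    return states


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def integrated_similarity(sources, diseases):
    """Steps 2 and 3: concatenate diffusion states, then compare diseases."""
    diffused = [random_walk_with_restart(similarity_network(s, diseases))
                for s in sources]
    features = [[x for d in diffused for x in d[i]]
                for i in range(len(diseases))]
    return [[cosine(fi, fj) for fj in features] for fi in features]


# Toy, made-up associations for three hypothetical diseases.
diseases = ["D1", "D2", "D3"]
genes = {"D1": {"g1", "g2"}, "D2": {"g1", "g2", "g3"}, "D3": {"g9"}}
go_terms = {"D1": {"p1"}, "D2": {"p1", "p2"}, "D3": {"p7"}}
symptoms = {"D1": {"fever", "cough"}, "D2": {"cough"}, "D3": {"rash"}}

sim = integrated_similarity([genes, go_terms, symptoms], diseases)
```

In this toy input, D1 and D2 share genes, GO terms, and a symptom, while D3 shares nothing with either, so `sim[0][1]` comes out higher than `sim[0][2]`. Concatenation keeps the sketch short; a faithful implementation would instead learn a compact representation from the diffusion states of the three networks.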

Results: Our approach achieves the best performance when compared with three other popular methods. In addition, the similarity network built by MultiSourcDSim suggests that our method can also uncover latent relationships between diseases.

Conclusions: MultiSourcDSim is an efficient approach for predicting similarity between diseases.

Keywords: Diffusion component analysis; Disease similarity network; Integrating multiple data sources.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Computational Biology / methods*
  • Disease / classification*
  • Humans