Reverse Nearest Neighbor Search on a Protein-Protein Interaction Network to Infer Protein-Disease Associations

Bioinform Biol Insights. 2017 Jul 13:11:1177932217720405. doi: 10.1177/1177932217720405. eCollection 2017.

Abstract

Associations between proteins and diseases are crucial for investigating pathological mechanisms. However, the number of known, reliable protein-disease associations is quite small. In this study, an analysis framework for inferring associations between proteins and diseases was developed. The framework is based on a large data set of a human protein-protein interaction network and integrates an effective network search, namely, the reverse k-nearest neighbor (RkNN) search. The RkNN search was used to identify the impact of a protein on other proteins. Associations between proteins and diseases were then inferred statistically. The method using the RkNN search yielded much higher precision than random selection, a standard nearest neighbor search, or the same method applied to a random protein-protein interaction network. All protein-disease pair candidates were checked by a literature search, and supporting evidence was identified for 596 pairs. In addition, cluster analysis of these candidates revealed 10 promising groups of diseases to be further investigated experimentally. This method can be used to identify novel associations to better understand complex relationships between proteins and diseases.

Keywords: network-based method; protein-disease associations; reverse nearest neighbor search.
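
To illustrate the RkNN search named in the abstract, the following is a minimal sketch in Python, not the implementation used in the study. It assumes an unweighted, undirected network, hop count (shortest path length) as the distance, and a tie rule in which a query protein belongs to the k nearest neighbors of a protein p when fewer than k other proteins are strictly closer to p. The abstract does not state the distance measure, tie handling, or value of k that were actually used, and the node names and toy network below are hypothetical.

```python
from collections import deque


def bfs_distances(graph, source):
    """Hop-count distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def reverse_knn(graph, query, k):
    """Return the nodes that count `query` among their k nearest neighbors.

    Assumed tie rule: `query` is a k-nearest neighbor of p when fewer than
    k other nodes are strictly closer to p than `query` is.
    """
    result = set()
    for p in graph:
        if p == query:
            continue
        dist = bfs_distances(graph, p)
        if query not in dist:
            continue  # query is unreachable from p
        closer = sum(
            1 for node, d in dist.items() if node != p and d < dist[query]
        )
        if closer < k:
            result.add(p)
    return result


# Hypothetical toy network: H is a hub, E sits at the end of a chain.
toy_network = {
    "H": {"A", "B", "C"},
    "A": {"H"},
    "B": {"H"},
    "C": {"H", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}

print(reverse_knn(toy_network, "H", 1))
print(reverse_knn(toy_network, "E", 1))
```

In this toy network the hub H is a nearest neighbor of three nodes (A, B, and C), whereas the peripheral node E is a nearest neighbor of D only. The size of the RkNN set is one way to read the "impact of a protein on other proteins" described in the abstract. This brute-force version runs one breadth-first search per node and is intended for illustration on small graphs only.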