Detecting non-uniform clusters in large-scale interaction graphs

J Comput Biol. 2014 Feb;21(2):173-83. doi: 10.1089/cmb.2013.0095. Epub 2013 Sep 19.

Abstract

Graph clustering becomes difficult as the graph size and complexity increase. In particular, in interaction graphs, the clusters are small and the data on the underlying interaction are not only complex, but also noisy due to the lack of information and experimental errors. The graphs representing such data consist of (possibly overlapping) clusters of non-uniform size with some false positive and false negative links. In this article, we propose a new approach, assuming that clusters in the graphs of protein-protein interaction (PPI) networks resemble corrupted cliques. Therefore, the problem can be reduced to looking for clusters only among nodes of approximately similar degrees. This idea was implemented using a soft version of the Farthest-Point-First (FPF) clustering algorithm with the Jaccard distance function modified to perform on slightly overlapping clusters. The StripClust program developed by us was tested on a synthetic network and on the yeast PPI network.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Cluster Analysis
  • Computer Graphics*
  • Protein Interaction Mapping / methods*
  • Saccharomyces cerevisiae / metabolism*
  • Saccharomyces cerevisiae Proteins / metabolism*

Substances

  • Saccharomyces cerevisiae Proteins