Link community detection using generative model and nonnegative matrix factorization

PLoS One. 2014 Jan 28;9(1):e86899. doi: 10.1371/journal.pone.0086899. eCollection 2014.

Abstract

Discovering communities in complex networks is a fundamental data analysis problem with applications in many domains. While most existing approaches focus on communities of nodes, recent studies have shown the advantages and uses of discovering link communities. Generative models are a promising class of techniques for identifying modular structure in networks, but most of them target node communities rather than link communities. In this work, we propose a generative model that describes the structure of link communities in terms of the importance of each node in forming links within each community. We fit the model parameters by casting the fitting as an optimization problem, which we solve using nonnegative matrix factorization. To determine the number of communities automatically, we then extend the method with an iterative bipartition strategy. The extended method both finds the number of communities without user input and runs efficiently, which makes it better suited to large and unexplored real networks. We test the approach on synthetic benchmarks and real-world networks, including an application to a large biological network, and compare it with two closely related methods. The results show that our approach outperforms the competing methods at detecting link communities.
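
The general idea of assigning links to communities through a nonnegative factorization can be illustrated with a minimal sketch. The abstract does not give the model's equations, so everything below is an assumption made for illustration: the adjacency matrix A is approximated as W Wᵀ with W ≥ 0 (a generic symmetric NMF with a damped multiplicative update, not necessarily the authors' objective), W[i, k] is read as the importance of node i in community k, and each link (i, j) is assigned to the community k that maximizes W[i, k] * W[j, k]. The function name and parameters are hypothetical, and the iterative bipartition extension is not shown.

```python
import numpy as np


def link_communities_nmf(A, k, n_iter=1000, seed=0, eps=1e-12):
    """Illustrative sketch (not the paper's exact model).

    Approximates a symmetric adjacency matrix A (n x n) as W @ W.T with
    W >= 0, then labels each link with its dominant community.
    Returns (W, links) where links maps (i, j) with i < j to a community index.
    """
    A = np.asarray(A, dtype=float)
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    # Strictly positive start so multiplicative updates can move every entry.
    W = rng.random((n, k)) + 0.1

    for _ in range(n_iter):
        # Damped multiplicative update for min ||A - W W^T||_F^2, W >= 0.
        numer = A @ W
        denom = W @ (W.T @ W) + eps
        W *= 0.5 + 0.5 * numer / denom

    links = {}
    rows, cols = np.nonzero(np.triu(A, 1))  # each undirected link once
    for i, j in zip(rows, cols):
        # Assumed rule: a link belongs to the community in which both of
        # its endpoints carry the most joint weight.
        links[(int(i), int(j))] = int(np.argmax(W[i] * W[j]))
    return W, links


if __name__ == "__main__":
    # Toy network: two disjoint triangles, nodes 0-2 and 3-5.
    A = np.zeros((6, 6))
    for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]:
        A[i, j] = A[j, i] = 1.0
    W, links = link_communities_nmf(A, k=2)
    for link, community in sorted(links.items()):
        print(link, "-> community", community)
```

Because labels are attached to links rather than nodes, a node whose links fall into different communities is naturally treated as belonging to several of them, which is the practical appeal of link communities over node partitions.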

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Animals
  • Community Networks*
  • Humans
  • Models, Theoretical*
  • Protein Interaction Maps
  • Residence Characteristics*
  • Saccharomyces cerevisiae / metabolism
  • Saccharomyces cerevisiae Proteins / metabolism

Substances

  • Saccharomyces cerevisiae Proteins

Grants and funding

This work is supported by Major State Basic Research Development Program of China (2013CB329301), National Natural Science Foundation of China (61303110, 61133011, 61373053, 61070089, 61373165, 61202308), PhD Programs Foundation of Ministry of Education of China (20130032120043), Open Project Program of Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education (93K172013K02), Innovation Foundation of Tianjin University (60302034), the TECHNO II project within Erasmus Mundus Programme of European Union, and the China Scholarship Council (award to Dongxiao He for one year’s study abroad at Washington University in St Louis). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.