Resampling effects on significance analysis of network clustering and ranking

PLoS One. 2013;8(1):e53943. doi: 10.1371/journal.pone.0053943. Epub 2013 Jan 23.

Abstract

Community detection helps us simplify the complex configuration of networks, but communities are reliable only if they are statistically significant. To detect statistically significant communities, a common approach is to resample the original network and analyze the communities. But resampling assumes independence between samples, while the components of a network are inherently dependent. Therefore, we must understand how breaking dependencies between resampled components affects the results of the significance analysis. Here we use scientific communication as a model system to analyze this effect. Our dataset includes citations among articles published in journals in the years 1984-2010. We compare parametric resampling of citations with non-parametric article resampling. While citation resampling breaks link dependencies, article resampling maintains such dependencies. We find that citation resampling underestimates the variance of link weights. Moreover, this underestimation explains most of the differences in the significance analysis of ranking and clustering. Therefore, when only link weights are available and article resampling is not an option, we suggest a simple parametric resampling scheme that generates link-weight variances close to the link-weight variances of article resampling. Nevertheless, when we highlight and summarize important structural changes in science, the more dependencies we can maintain in the resampling scheme, the earlier we can predict structural change.
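
The contrast between the two resampling schemes can be illustrated with a minimal sketch on synthetic data. It is not the authors' implementation: the toy articles, the journal labels "B" and "C", and all function names are hypothetical, and the citation scheme is assumed to be a simple multinomial draw of individual citations, which the abstract does not specify. Each toy article sends all of its citations to one journal, so citations within an article are strongly dependent.

```python
import random
import statistics


def make_articles(n_articles=200, refs_per_article=10, seed=0):
    """Build synthetic articles; each is a list of cited journal labels.

    Assumption for illustration: every article cites a single journal
    repeatedly, which makes citations within an article fully dependent.
    """
    rng = random.Random(seed)
    articles = []
    for _ in range(n_articles):
        target = rng.choice(["B", "C"])
        articles.append([target] * refs_per_article)
    return articles


def link_weight(articles, target="B"):
    """Link weight = total number of citations pointing at `target`."""
    return sum(refs.count(target) for refs in articles)


def article_resample(articles, rng):
    """Non-parametric bootstrap: draw whole articles with replacement,
    so each article's citations stay together."""
    return [rng.choice(articles) for _ in articles]


def citation_resample(articles, rng):
    """Draw individual citations with replacement (a multinomial draw),
    ignoring which article each citation came from."""
    pool = [cited for refs in articles for cited in refs]
    sample = rng.choices(pool, k=len(pool))
    return [sample]  # one pseudo-article holding all resampled citations


def weight_variance(resampler, articles, n_boot=500, seed=1):
    """Variance of the link weight across bootstrap replicates."""
    rng = random.Random(seed)
    weights = [link_weight(resampler(articles, rng)) for _ in range(n_boot)]
    return statistics.variance(weights)


articles = make_articles()
var_article = weight_variance(article_resample, articles)
var_citation = weight_variance(citation_resample, articles)
print(f"article resampling variance:  {var_article:.1f}")
print(f"citation resampling variance: {var_citation:.1f}")
```

In this toy setting the link-weight variance from citation resampling comes out well below that from article resampling, because treating citations as independent discards the within-article clustering. How large the gap is in real citation data depends on how strongly citations cluster within articles.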

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bibliometrics
  • Cluster Analysis
  • Humans
  • Models, Statistical*
  • Publishing / statistics & numerical data*

Grants and funding

MR was supported by the Swedish Research Council grant 2009-5344. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.