Fast consensus clustering in complex networks

Phys Rev E. 2019 Apr;99(4-1):042301. doi: 10.1103/PhysRevE.99.042301.

Abstract

Algorithms for community detection are usually stochastic, leading to different partitions for different choices of random seeds. Consensus clustering has proven to be an effective technique to derive more stable and accurate partitions than the ones obtained by the direct application of the algorithm. However, the procedure requires the calculation of the consensus matrix, which can be quite dense if (some of) the clusters of the input partitions are large. Consequently, the complexity can get dangerously close to quadratic, which makes the technique inapplicable to large graphs. Here, we present a fast variant of consensus clustering, which calculates the consensus matrix only on the links of the original graph and on a comparable number of additional node pairs, suitably chosen. This brings the complexity down to linear, while the performance remains comparable to that of the full technique. Therefore, our fast consensus clustering procedure can be applied to networks with millions of nodes and links.
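
The core idea, evaluating consensus only on a sparse set of node pairs, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `sparse_consensus`, the representation of partitions as node-to-label dictionaries, and the `extra_pairs` argument are assumptions made for illustration. The abstract does not say how the additional node pairs are chosen, so the sketch leaves that choice to the caller.

```python
def sparse_consensus(edges, partitions, extra_pairs=()):
    """Consensus weights restricted to a sparse set of node pairs.

    edges       -- iterable of (u, v) links of the original graph
    partitions  -- list of dicts mapping node -> cluster label,
                   one dict per run of the stochastic algorithm
    extra_pairs -- additional (u, v) pairs to evaluate (hypothetical
                   argument; the selection rule is not given in the abstract)

    Returns a dict mapping each pair to the fraction of partitions
    in which both nodes fall in the same cluster.
    """
    n_partitions = len(partitions)
    if n_partitions == 0:
        raise ValueError("at least one partition is required")

    weights = {}
    for u, v in list(edges) + list(extra_pairs):
        # Count the partitions that co-cluster u and v.
        together = sum(1 for p in partitions if p[u] == p[v])
        weights[(u, v)] = together / n_partitions
    return weights


# Toy example: a path graph 0-1-2-3 and two input partitions.
edges = [(0, 1), (1, 2), (2, 3)]
partitions = [
    {0: "a", 1: "a", 2: "b", 3: "b"},
    {0: "a", 1: "a", 2: "a", 3: "b"},
]
weights = sparse_consensus(edges, partitions, extra_pairs=[(0, 2)])
print(weights)
```

The cost of this sketch is proportional to the number of evaluated pairs times the number of input partitions, so it stays linear in the number of links as long as the number of extra pairs is comparable to the number of links. A full consensus matrix would instead evaluate every pair of nodes.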