Disentangling categorical relationships through a graph of co-occurrences

Phys Rev E Stat Nonlin Soft Matter Phys. 2011 Oct;84(4 Pt 2):046108. doi: 10.1103/PhysRevE.84.046108. Epub 2011 Oct 19.

Abstract

The mesoscopic structure of complex networks has proven a powerful level of description for understanding the linchpins of the system represented by the network. Nevertheless, mapping a series of relationships between elements onto a graph is sometimes not straightforward. Given that all the information we would extract using complex network tools depends on this initial graph, it is mandatory to preprocess the data so that the graph is built as accurately as possible. Here we propose a procedure to build a network that attends only to statistically significant relations between constituents. We use a paradigmatic example of word associations to show the development of our approach. By analyzing the modular structure of the obtained network, we are able to disentangle categorical relations, disambiguating words with a success comparable to that of the best algorithms designed for the same purpose.
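
The idea of keeping only statistically significant co-occurrences can be illustrated with a minimal sketch. The abstract does not specify the null model, so the code below assumes a hypergeometric one: two words that appear in n1 and n2 of N documents are linked only if the number of documents they share is unlikely under random placement. The function names (`hypergeom_tail`, `significant_edges`), the threshold `alpha`, and the toy documents are hypothetical and chosen for illustration only.

```python
from collections import Counter
from itertools import combinations
from math import comb


def hypergeom_tail(k, N, n1, n2):
    """P(X >= k) when X counts documents shared by chance between two words
    that occur in n1 and n2 of N documents (assumed hypergeometric null)."""
    total = comb(N, n2)
    upper = min(n1, n2)
    favourable = sum(comb(n1, j) * comb(N - n1, n2 - j) for j in range(k, upper + 1))
    return favourable / total


def significant_edges(documents, alpha=0.01):
    """Return {(word_a, word_b): p_value} for pairs whose co-occurrence
    count has p-value below alpha (an illustrative threshold)."""
    N = len(documents)
    word_sets = [set(doc) for doc in documents]
    # Number of documents containing each word.
    occurrences = Counter(w for s in word_sets for w in s)
    # Number of documents containing each unordered pair of words.
    cooccurrences = Counter(
        pair for s in word_sets for pair in combinations(sorted(s), 2)
    )
    edges = {}
    for (a, b), k in cooccurrences.items():
        p = hypergeom_tail(k, N, occurrences[a], occurrences[b])
        if p < alpha:
            edges[(a, b)] = p
    return edges


# Toy corpus: "the" appears everywhere, so its co-occurrences carry no signal.
docs = [["bank", "money", "the"]] * 5 + [["river", "water", "the"]] * 5
edges = significant_edges(docs)
print(sorted(edges))
```

In this toy corpus the ubiquitous word "the" co-occurs with every other word, yet none of those pairs passes the test, while the two topical pairs do. A community-detection step on the resulting graph would then be the natural place to look for the categorical structure the abstract refers to; that step is not shown here.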