Empirically classifying network mechanisms

Ryan E Langendorf; Matthew G Burgess

doi:10.1038/s41598-021-99251-7

Sci Rep. 2021 Oct 15;11(1):20501. doi: 10.1038/s41598-021-99251-7.

Authors

Ryan E Langendorf¹, Matthew G Burgess² ³ ⁴

Affiliations

¹ Cooperative Institute for Research in Environmental Sciences, University of Colorado Boulder, 216 UCB, Boulder, CO, 80309, USA. ryan.langendorf@colorado.edu.
² Cooperative Institute for Research in Environmental Sciences, University of Colorado Boulder, 216 UCB, Boulder, CO, 80309, USA.
³ Environmental Studies Program, University of Colorado Boulder, 397 UCB, Boulder, CO, 80303, USA.
⁴ Department of Economics, University of Colorado Boulder, 256 UCB, Boulder, CO, 80309, USA.

Abstract

Network data are often explained by assuming a generating mechanism and estimating related parameters. Without a way to test the relevance of assumed mechanisms, conclusions from such models may be misleading. Here we introduce a simple empirical approach to mechanistically classify arbitrary network data as originating from any of a set of candidate mechanisms or none of them. We tested our approach on simulated data from five of the most widely studied network mechanisms, and found it to be highly accurate. We then tested 1284 empirical networks spanning 17 different kinds of systems against these five widely studied mechanisms. We found that 387 (30%) of these empirical networks were classified as unlike any of the mechanisms, and only 1% or fewer of the networks classified as each of the mechanisms for which our approach was most sensitive. Based on this, we use Bayes' theorem to show that most of the 70% of empirical networks our approach classified as a mechanism could be false positives, because of the high sensitivity required of a test to detect rarely occurring mechanisms. Thus, it is possible that very few of our empirical networks are described by any of these five widely studied mechanisms. Additionally, 93 networks (7%) were classified as plausibly being governed by each of multiple mechanisms. This raises the possibility that some systems are governed by mixtures of mechanisms. We show that mixtures are often unidentifiable because different mixtures can produce structurally equivalent networks, but that we can still accurately predict out-of-sample functional properties.
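The base-rate argument in the abstract can be illustrated with a short calculation. The sketch below applies Bayes' theorem to a classifier that flags a network as matching a mechanism. The function name and all three input values (a 1% prevalence, 95% sensitivity, 95% specificity) are hypothetical and chosen only for illustration; they are not the rates reported in the paper.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Return P(mechanism is present | network classified as that mechanism).

    prevalence:  assumed share of networks truly generated by the mechanism
    sensitivity: P(classified as mechanism | mechanism present)
    specificity: P(not classified as mechanism | mechanism absent)
    """
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


# Hypothetical inputs, NOT values from the paper.
ppv = positive_predictive_value(prevalence=0.01, sensitivity=0.95, specificity=0.95)
print(f"Share of positive classifications that are correct: {ppv:.3f}")
```

Under these assumed inputs, fewer than one in five positive classifications would be correct, even though the classifier is right 95% of the time in each direction. This is the sense in which a rarely occurring mechanism can leave most positive classifications as false positives.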

Publication types

Research Support, Non-U.S. Gov't