Precision as a measure of predictability of missing links in real networks

Guillermo García-Pérez; Roya Aliakbarisani; Abdorasoul Ghasemi; M Ángeles Serrano

doi:10.1103/PhysRevE.101.052318

Phys Rev E. 2020 May;101(5-1):052318. doi: 10.1103/PhysRevE.101.052318.

Authors

Guillermo García-Pérez¹ ², Roya Aliakbarisani³, Abdorasoul Ghasemi³, M Ángeles Serrano⁴ ⁵ ⁶

Affiliations

¹ QTF Centre of Excellence, Turku Centre for Quantum Physics, Department of Physics and Astronomy, University of Turku, FI-20014 Turun Yliopisto, Finland.
² Complex Systems Research Group, Department of Mathematics and Statistics, University of Turku, FI-20014 Turun Yliopisto, Finland.
³ Faculty of Computer Engineering, K. N. Toosi University of Technology, Tehran 1631714191, Iran.
⁴ Departament de Física de la Matèria Condensada, Universitat de Barcelona, Martí i Franquès 1, 08028 Barcelona, Spain.
⁵ Universitat de Barcelona Institute of Complex Systems (UBICS), Universitat de Barcelona, Barcelona, Spain.
⁶ ICREA, Pg. Lluís Companys 23, E-08010 Barcelona, Spain.

PMID: 32575233
DOI: 10.1103/PhysRevE.101.052318

Abstract

Predicting missing links in real networks is an important open problem in network science to which considerable effort has been devoted, resulting in a large number of link prediction methods in the literature. In this work, we take a different point of view on the problem and focus on predictability instead of prediction. By considering ensembles defined by well-known network models, we prove analytically that even the best possible link prediction method, given by the ensemble connection probabilities, yields a limited precision that depends quantitatively on the topological properties of the ensemble, such as degree heterogeneity, clustering, and community structure. This suggests an absolute limitation to the predictability of missing links in real networks, due to the irreducible uncertainty arising from the random nature of link formation processes. We show that a predictability limit can be estimated in real networks, and we propose a method to approximate such a bound from real-world networks with missing links. The predictability limit gives a benchmark to gauge the quality of link prediction methods in real networks.
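
The idea that even an ideal predictor has limited precision can be illustrated numerically. The following is a minimal sketch, not the authors' method: it assumes precision is the fraction of true missing links among the top-ranked candidate pairs (with as many candidates selected as links were removed), that links are removed uniformly at random, and that the ensemble has independent links with a hypothetical connection probability p_ij = x_i x_j / (1 + x_i x_j). The function name, the parameter q, and the distribution of the hidden variables x are all assumptions made for illustration.

```python
import numpy as np

def precision_of_ideal_predictor(p, q=0.1, seed=0):
    """Precision obtained when unobserved pairs are ranked by the true
    connection probabilities p[i, j] of the ensemble (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    n = p.shape[0]
    iu, ju = np.triu_indices(n, k=1)            # all unordered node pairs
    probs = p[iu, ju]

    links = rng.random(probs.size) < probs      # sample one network from the ensemble
    removed = links & (rng.random(probs.size) < q)  # hide each link with probability q
    observed = links & ~removed

    n_missing = int(removed.sum())
    if n_missing == 0:
        return float("nan")                     # nothing to predict

    # Only pairs without an observed link are candidates. Under uniform
    # removal, ranking candidates by p is equivalent to ranking them by the
    # posterior probability of being a missing link (assumption of this sketch).
    scores = np.where(~observed, probs, -np.inf)
    top = np.argsort(-scores, kind="stable")[:n_missing]
    return removed[top].sum() / n_missing

# Hypothetical heterogeneous ensemble: heavy-tailed hidden variables x.
rng = np.random.default_rng(1)
x = 0.05 + 0.3 * rng.pareto(2.5, size=200)
xx = np.outer(x, x)
p_example = xx / (1.0 + xx)
print(precision_of_ideal_predictor(p_example, q=0.1))
```

Because each link in such an ensemble is an independent random event, many top-ranked candidate pairs are genuinely unconnected in the sampled network, so the printed precision generally stays below 1 even though the ranking uses the exact probabilities. In the degenerate case where every p_ij equals 1, the only unobserved pairs are the removed links and the precision is exactly 1.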