Principled approach to the selection of the embedding dimension of networks

Weiwei Gu; Aditya Tandon; Yong-Yeol Ahn; Filippo Radicchi

doi:10.1038/s41467-021-23795-5

Nat Commun. 2021 Jun 18;12(1):3772. doi: 10.1038/s41467-021-23795-5.

Authors

Weiwei Gu¹, Aditya Tandon², Yong-Yeol Ahn² ³ ⁴, Filippo Radicchi⁵

Affiliations

¹ UrbanNet Lab, College of Information Science and Technology, Beijing University of Chemical Technology, Beijing, P. R. China.
² Center for Complex Networks and Systems Research, Luddy School of Informatics, Computing, and Engineering, Indiana University, Bloomington, IN, USA.
³ Network Science Institute, Indiana University, Bloomington (IUNI), IN, USA.
⁴ Connection Science, Massachusetts Institute of Technology, Cambridge, MA, USA.
⁵ Center for Complex Networks and Systems Research, Luddy School of Informatics, Computing, and Engineering, Indiana University, Bloomington, IN, USA. filiradi@indiana.edu.

Abstract

Network embedding is a general-purpose machine learning technique that encodes network structure in vector spaces with tunable dimension. Choosing an appropriate embedding dimension, small enough to be efficient and large enough to be effective, is challenging but necessary to generate embeddings applicable to a multitude of tasks. Existing strategies for the selection of the embedding dimension rely on performance maximization in downstream tasks. Here, we propose a principled method such that all structural information of a network is parsimoniously encoded. The method is validated on various embedding algorithms and a large corpus of real-world networks. The embedding dimension selected by our method in real-world networks suggests that efficient encoding in low-dimensional spaces is usually possible.

Publication types

Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.