HeteEdgeWalk: A Heterogeneous Edge Memory Random Walk for Heterogeneous Information Network Embedding

Entropy (Basel). 2023 Jun 29;25(7):998. doi: 10.3390/e25070998.

Abstract

Most Heterogeneous Information Network (HIN) embedding methods use meta-paths to guide random walks to sample from HIN and perform representation learning in order to overcome the bias of traditional random walks that are more biased towards high-order nodes. Their performance depends on the suitability of the generated meta-paths for the current HIN. The definition of meta-paths requires domain expertise, which makes the results overly dependent on the meta-paths. Moreover, it is difficult to represent the structure of complex HIN with a single meta-path. In a meta-path guided random walk, some of the heterogeneous structures (e.g., node type(s)) are not among the node types specified by the meta-path, making this heterogeneous information ignored. In this paper, HeteEdgeWalk, a solution method that does not involve meta-paths, is proposed. We design a dynamically adjusted bidirectional edge-sampling walk strategy. Specifically, edge sampling and the storage of recently selected edge types are used to better sample the network structure in a more balanced and comprehensive way. Finally, node classification and clustering experiments are performed on four real HINs with in-depth analysis. The results show a maximum performance improvement of 2% in node classification and at least 0.6% in clustering compared to baselines. This demonstrates the superiority of the method to effectively capture semantic information from HINs.

Keywords: edge sampling; heterogeneous information network; network embeddings; random walk; representation learning.