Revisiting Initializing Then Refining: An Incomplete and Missing Graph Imputation Network

IEEE Trans Neural Netw Learn Syst. 2024 Jan 12:PP. doi: 10.1109/TNNLS.2024.3349850. Online ahead of print.

Abstract

With the development of various applications, such as recommendation systems and social network analysis, graph data have become ubiquitous in the real world. However, graph data often suffer from attribute absence during data collection due to copyright restrictions or privacy-protection policies. Such absence can be roughly grouped into attribute-incomplete and attribute-missing cases. Specifically, attribute-incomplete means that a portion of the attribute vector of every node is incomplete, whereas attribute-missing means that the entire attribute vectors of some nodes are missing. Although various graph imputation methods have been proposed, none of them is custom-designed for the common situation where both types of absence exist simultaneously. To fill this gap, we develop a novel graph imputation network termed revisiting initializing then refining (RITR), in which both attribute-incomplete and attribute-missing samples are completed under the guidance of a novel initializing-then-refining imputation criterion. Specifically, to complete attribute-incomplete samples, we first initialize the incomplete attributes with Gaussian noise before network learning and then introduce a structure-attribute consistency constraint that refines the incomplete values by encouraging a structure-attribute correlation matrix to approximate a high-order structure matrix. To complete attribute-missing samples, we first adopt the structure embeddings of attribute-missing samples as the embedding initialization and then refine these initial values by adaptively aggregating reliable information from attribute-incomplete samples according to a dynamic affinity structure. To the best of our knowledge, this is the first end-to-end unsupervised framework dedicated to handling hybrid-absent graphs. Extensive experiments on six datasets verify that our method consistently outperforms existing state-of-the-art competitors. Our source code is available at https://github.com/WxTu/RITR.
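
To make the initializing-then-refining criterion described above more concrete, the following is a minimal PyTorch sketch, not the authors' implementation (see the linked repository for that). It assumes dense tensors: node attributes X with a boolean observed-entry mask M, an adjacency matrix A, attribute and structure embeddings H_attr and H_struct produced by some encoder, and an index tensor of attribute-missing nodes. All function and parameter names here are illustrative assumptions.

import torch
import torch.nn.functional as F

def initialize_incomplete(X, M, noise_std=0.01):
    # Initialization step for attribute-incomplete samples:
    # fill unobserved attribute entries with Gaussian noise before training.
    noise = noise_std * torch.randn_like(X)
    return torch.where(M, X, noise)

def high_order_structure(A, order=2):
    # Row-normalized adjacency (with self-loops) raised to `order`,
    # used as the target of the refinement constraint.
    A_hat = A + torch.eye(A.size(0))
    A_norm = A_hat / A_hat.sum(dim=1, keepdim=True)
    return torch.linalg.matrix_power(A_norm, order)

def structure_attribute_consistency(H_attr, H_struct, S_high):
    # Refinement step for attribute-incomplete samples: push the
    # structure-attribute correlation matrix toward the high-order
    # structure matrix S_high.
    corr = F.normalize(H_attr, dim=1) @ F.normalize(H_struct, dim=1).t()
    return F.mse_loss(corr, S_high)

def impute_missing_embeddings(H_attr, H_struct, missing_idx, temperature=1.0):
    # For attribute-missing samples: start from their structure embeddings
    # (implicitly, via the affinity computed in structure space), then refine
    # by aggregating attribute embeddings of the attribute-observed nodes
    # with softmax affinity weights.
    n = H_attr.size(0)
    missing = set(missing_idx.tolist())
    observed_idx = torch.tensor([i for i in range(n) if i not in missing])
    affinity = H_struct[missing_idx] @ H_struct[observed_idx].t() / temperature
    weights = torch.softmax(affinity, dim=1)
    H_out = H_attr.clone()
    H_out[missing_idx] = weights @ H_attr[observed_idx]
    return H_out

In a training loop, one would typically call initialize_incomplete once before learning, add structure_attribute_consistency(H_attr, H_struct, high_order_structure(A)) to the loss at every step, and apply impute_missing_embeddings to obtain embeddings for the attribute-missing nodes; the affinity in the last step is recomputed each iteration, loosely mirroring the dynamic affinity structure mentioned in the abstract.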