Distributed Optimization of Graph Convolutional Network Using Subgraph Variance

IEEE Trans Neural Netw Learn Syst. 2023 Feb 22:PP. doi: 10.1109/TNNLS.2023.3243904. Online ahead of print.

Abstract

In recent years, distributed graph convolutional network (GCN) training frameworks have achieved great success in learning representations of large-scale graph-structured data. However, existing distributed GCN training frameworks incur enormous communication costs because a multitude of dependent graph data must be transmitted from other processors. To address this issue, we propose a graph augmentation-based distributed GCN framework (GAD). In particular, GAD has two main components: GAD-Partition and GAD-Optimizer. We first propose an augmentation-based graph partition (GAD-Partition) that divides the input graph into augmented subgraphs, reducing communication by selecting and storing as few significant vertices from other processors as possible. To further speed up distributed GCN training and improve the quality of the training result, we design a subgraph variance-based importance calculation formula and propose a novel weighted global consensus method, collectively referred to as GAD-Optimizer. This optimizer adaptively adjusts the importance of subgraphs to reduce the effect of the extra variance introduced by GAD-Partition on distributed GCN training. Extensive experiments on four large-scale real-world datasets demonstrate that our framework significantly reduces the communication overhead (≈50%), improves the convergence speed (≈2×) of distributed GCN training, and yields a slight gain in accuracy (≈0.45%) with minimal redundancy compared to state-of-the-art methods.
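To make the GAD-Optimizer idea concrete, the following is a minimal illustrative sketch, not the paper's actual formula: it assumes an inverse-gradient-variance weighting per subgraph and an importance-weighted parameter average as the "weighted global consensus". The function names (subgraph_importance, weighted_global_consensus) and the specific variance proxy are hypothetical choices for illustration only.

```python
# Illustrative sketch only: the abstract does not give the exact formulas, so the
# variance measure, the inverse-variance weighting, and all names here are assumptions.
import numpy as np

def subgraph_importance(subgraph_grads):
    """Assign each subgraph a weight inversely related to its gradient variance.

    subgraph_grads: list of flattened stochastic-gradient arrays, one per subgraph.
    Returns weights summing to 1; lower-variance subgraphs receive larger weights
    (a hypothetical proxy for a subgraph-variance-based importance score).
    """
    variances = np.array([np.var(g) for g in subgraph_grads])
    inv = 1.0 / (variances + 1e-12)   # guard against division by zero
    return inv / inv.sum()

def weighted_global_consensus(local_params, weights):
    """Aggregate per-processor model parameters by importance-weighted averaging."""
    stacked = np.stack(local_params)  # shape: (num_subgraphs, num_params)
    return np.average(stacked, axis=0, weights=weights)

# Toy usage: three processors, each holding gradients/parameters for one augmented subgraph.
rng = np.random.default_rng(0)
grads = [rng.normal(0.0, s, size=1000) for s in (0.1, 0.5, 1.0)]
params = [rng.normal(0.0, 1.0, size=1000) for _ in range(3)]
w = subgraph_importance(grads)
global_params = weighted_global_consensus(params, w)
```

Under these assumptions, subgraphs whose augmented structure induces noisier (higher-variance) gradients contribute less to the consensus model, which mirrors the abstract's stated goal of damping the extra variance introduced by GAD-Partition.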