Transfer-Learning-Based Gaussian Mixture Model for Distributed Clustering

IEEE Trans Cybern. 2023 Nov;53(11):7058-7070. doi: 10.1109/TCYB.2022.3177242. Epub 2023 Oct 17.

Abstract

Distributed clustering based on the Gaussian mixture model (GMM) has exhibited excellent clustering capability in peer-to-peer (P2P) networks. However, existing distributed GMM clustering algorithms require many iterations and considerable communication overhead to reach consensus. In addition, the fact that no closed-form solution exists for the parameter updates of the GMM leads to imprecise clustering accuracy. To address these issues, a general transfer distributed GMM clustering framework is developed by utilizing the transfer learning technique to improve clustering performance and accelerate convergence. In this framework, each node is treated as both a source domain and a target domain, and the nodes learn from one another to complete the clustering task in distributed P2P networks. Based on this framework, a transfer distributed expectation-maximization (EM) algorithm with a fixed learning rate is first presented for data clustering. An improved version is then designed to obtain stable clustering accuracy, in which an adaptive transfer learning strategy automatically adjusts the learning rate instead of using a fixed value. To demonstrate the extensibility of the proposed framework, a representative GMM clustering method, the entropy-type classification maximum-likelihood algorithm, is further extended to its transfer distributed counterpart. Experimental results verify the effectiveness of the presented algorithms in comparison with existing GMM clustering approaches.
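
The abstract does not give the paper's update equations, but the mechanism it describes, where each node runs local EM and then pulls its parameters toward its neighbors' estimates at a learning rate that is either fixed or adaptively adjusted, can be sketched roughly as follows. This is a minimal illustrative sketch, not the authors' algorithm: the convex-combination transfer step in `transfer_update` and the likelihood-based adaptation rule in `adapt_eta` are assumptions introduced here for illustration only.

```python
import numpy as np

def e_step(X, weights, means, covs):
    """Standard EM E-step: responsibilities gamma[n, k] under the local GMM."""
    n, k = X.shape[0], len(weights)
    gamma = np.zeros((n, k))
    for j in range(k):
        diff = X - means[j]
        quad = np.einsum('ni,ij,nj->n', diff, np.linalg.inv(covs[j]), diff)
        norm = np.sqrt(np.linalg.det(2 * np.pi * covs[j]))
        gamma[:, j] = weights[j] * np.exp(-0.5 * quad) / norm
    return gamma / gamma.sum(axis=1, keepdims=True)

def m_step(X, gamma):
    """Standard EM M-step: local estimates of weights, means, covariances."""
    nk = gamma.sum(axis=0)
    weights = nk / X.shape[0]
    means = (gamma.T @ X) / nk[:, None]
    covs = []
    for j in range(len(nk)):
        diff = X - means[j]
        covs.append((gamma[:, j, None] * diff).T @ diff / nk[j]
                    + 1e-6 * np.eye(X.shape[1]))  # regularize for stability
    return weights, means, np.array(covs)

def transfer_update(local, neighbor_avg, eta):
    """Assumed transfer step: blend local parameters with the neighbor
    average; eta = 0 keeps the purely local estimate, eta = 1 fully
    adopts the neighbors' parameters."""
    return tuple((1 - eta) * loc + eta * nbr
                 for loc, nbr in zip(local, neighbor_avg))

def adapt_eta(eta, ll_new, ll_old, grow=1.1, shrink=0.5):
    """Assumed adaptive rule standing in for the paper's strategy: increase
    the learning rate while the local log-likelihood improves, shrink it
    otherwise."""
    return min(eta * grow, 1.0) if ll_new > ll_old else max(eta * shrink, 1e-3)
```

In a P2P round under these assumptions, each node would run `e_step`/`m_step` on its local data, exchange parameters with its neighbors, average them, and call `transfer_update`; the fixed-rate variant keeps `eta` constant across rounds, while the adaptive variant updates it with `adapt_eta` after each round.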