Combined Gaussian Mixture Model and Pathfinder Algorithm for Data Clustering

Entropy (Basel). 2023 Jun 16;25(6):946. doi: 10.3390/e25060946.

Abstract

Data clustering is one of the most influential branches of machine learning and data analysis, and Gaussian Mixture Models (GMMs) are frequently adopted for clustering because they are easy to implement. However, the approach has limitations: the number of clusters must be specified manually, and the initialization step may fail to exploit the information contained in the dataset. To address these issues, we propose a new clustering algorithm called PFA-GMM, which combines GMMs with the Pathfinder algorithm (PFA). The algorithm first determines the optimal number of clusters automatically from the dataset. It then treats clustering as a global optimization problem, so as to avoid getting trapped in local optima during initialization. Finally, we compared PFA-GMM against other well-known clustering algorithms on both synthetic and real-world datasets. The experimental results indicate that PFA-GMM outperformed the competing approaches.

Keywords: Gaussian Mixture Models; clustering; metaheuristic algorithm; pathfinder algorithm.