Dynamic Sub-Swarm Approach of PSO Algorithms for Text Document Clustering

Sensors (Basel). 2022 Dec 9;22(24):9653. doi: 10.3390/s22249653.

Abstract

Text document clustering is a data mining technique used in many real-world applications, such as information retrieval from IoT sensor data, duplicate content detection, and document organization. Swarm intelligence (SI) algorithms are better suited to complex text document clustering problems than traditional clustering algorithms. Previous studies show that, among SI algorithms, particle swarm optimization (PSO) provides an effective solution to text document clustering problems. However, PSO still needs improvement to avoid problems such as premature convergence to local optima. In this paper, an approach called dynamic sub-swarm PSO (subswarm-PSO) is proposed to improve the results of PSO for text document clustering and to avoid local optima by strengthening the global search capability of PSO. The results of the proposed approach were compared with those of the standard PSO algorithm and the K-means algorithm. Performance was evaluated using the purity metric on six benchmark data sets. The experimental results show that the proposed subswarm-PSO algorithm achieves higher purity than both the standard PSO and the traditional K-means algorithms, and that its execution time is slightly lower than that of the standard PSO algorithm.
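
The abstract names purity as the evaluation metric without defining it. Under the usual definition, each cluster is credited with the count of its most frequent true class, and purity is the sum of those counts divided by the total number of documents. The following is a minimal sketch of that common definition, not the authors' evaluation code; the function name `purity` and the toy labels are assumptions made for illustration.

```python
from collections import Counter


def purity(cluster_ids, class_labels):
    """Purity under the common definition (illustrative helper, not from the paper).

    cluster_ids:  cluster assigned to each document by the clustering algorithm
    class_labels: ground-truth class of each document, in the same order
    """
    if len(cluster_ids) != len(class_labels):
        raise ValueError("cluster_ids and class_labels must have equal length")
    if not cluster_ids:
        raise ValueError("at least one document is required")

    # Count how often each true class occurs inside each cluster.
    counts_by_cluster = {}
    for cluster, label in zip(cluster_ids, class_labels):
        counts_by_cluster.setdefault(cluster, Counter())[label] += 1

    # Each cluster contributes the size of its majority class.
    majority_total = sum(max(c.values()) for c in counts_by_cluster.values())
    return majority_total / len(cluster_ids)


# Toy data (hypothetical): six documents, two clusters, two classes.
clusters = [0, 0, 0, 1, 1, 1]
labels = ["sports", "sports", "tech", "tech", "tech", "sports"]
score = purity(clusters, labels)  # majority counts 2 + 2 over 6 documents
```

A purity of 1.0 means every cluster contains documents from a single class. Purity tends to increase with the number of clusters, so comparisons are most meaningful when the algorithms produce the same number of clusters.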

Keywords: particle swarm optimization; sub-swarm PSO; swarm intelligence; text document clustering.

Grants and funding

This research was supported by the BK21 FOUR (Fostering Outstanding Universities for Research) 5120200213791, funded by the Ministry of Education (MOE, Korea) and the National Research Foundation of Korea (NRF).