An ensemble based approach using a combination of clustering and classification algorithms to enhance customer churn prediction in telecom industry

PeerJ Comput Sci. 2022 Feb 22:8:e854. doi: 10.7717/peerj-cs.854. eCollection 2022.

Abstract

Mobile communication has become a dominant medium of communication over the past two decades. With new technologies and competitors emerging rapidly, churn prediction has become a major concern for telecom companies. A customer churn prediction model can accurately identify potential churners so that a retention solution can be offered to them. The proposed churn prediction model is a hybrid that combines clustering and classification algorithms in an ensemble. First, different clustering algorithms (i.e., K-means, K-medoids, X-means and random clustering) were evaluated individually on two churn prediction datasets. Hybrid models were then built by combining the clusters with each of seven different classification algorithms, and the resulting models were evaluated using ensembles. The approach was evaluated on two benchmark telecom datasets obtained from the GitHub and Bigml platforms. The results indicated that the proposed model attained its highest prediction accuracy of 94.7% on the GitHub dataset and 92.43% on the Bigml dataset. The proposed model was also compared with state-of-the-art churn prediction models and performed significantly better than them.
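The general idea of a clustering-plus-classification hybrid evaluated through an ensemble can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the clusters are combined with the classifiers, so the sketch assumes the K-means cluster index is appended to each record as an extra feature, uses three simple classifiers (1-nearest-neighbour, 3-nearest-neighbour and nearest class centroid) as stand-ins for the seven classification algorithms, and assumes majority voting as the ensemble rule. All function names, the two features and the toy data are hypothetical.

```python
import math
import random
from collections import Counter

def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(points) for col in zip(*points)]

def nearest(p, centroids):
    """Index of the centroid closest to p."""
    return min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))

def kmeans(points, k, iters=20, seed=0):
    """Plain K-means; initial centroids are k randomly sampled points."""
    centroids = random.Random(seed).sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [mean(g) if g else c for g, c in zip(groups, centroids)]
    return centroids

def augment(p, centroids):
    """ASSUMPTION: the cluster index is appended as an extra feature."""
    return list(p) + [float(nearest(p, centroids))]

def knn_classifier(X, y, k):
    """k-nearest-neighbour classifier returned as a predict function."""
    def predict(p):
        idx = sorted(range(len(X)), key=lambda i: dist(p, X[i]))[:k]
        return Counter(y[i] for i in idx).most_common(1)[0][0]
    return predict

def centroid_classifier(X, y):
    """Nearest-class-centroid classifier returned as a predict function."""
    cents = {c: mean([x for x, l in zip(X, y) if l == c]) for c in set(y)}
    def predict(p):
        return min(cents, key=lambda c: dist(p, cents[c]))
    return predict

def fit_hybrid(X, y, n_clusters=2):
    """Cluster, augment the features, train members, vote by majority."""
    centroids = kmeans(X, n_clusters)
    Xa = [augment(p, centroids) for p in X]
    members = [
        knn_classifier(Xa, y, 1),
        knn_classifier(Xa, y, 3),
        centroid_classifier(Xa, y),
    ]
    def predict(p):
        pa = augment(p, centroids)
        return Counter(m(pa) for m in members).most_common(1)[0][0]
    return predict

# Hypothetical toy data: [scaled usage, scaled support calls]; 1 = churner.
X = [[1.0, 0.2], [1.2, 0.1], [0.9, 0.3], [1.1, 0.2],
     [0.2, 1.0], [0.3, 1.2], [0.1, 0.9], [0.2, 1.1]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

model = fit_hybrid(X, y)
print(model([1.0, 0.25]))  # a record resembling the non-churners
print(model([0.25, 1.0]))  # a record resembling the churners
```

An odd number of ensemble members keeps the majority vote free of ties for a binary churn label. Training a separate classifier inside each cluster would be another plausible way to combine clusters with classifiers.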

Keywords: Churn prediction; Classification; Clustering; Decision support system; Hybrid model.

Associated data

  • figshare/10.6084/m9.figshare.18130610.v1

Grants and funding

The authors received no funding for this work.