Environmental air pollution clustering using enhanced ensemble clustering methodology

Environ Sci Pollut Res Int. 2021 Aug;28(30):40746-40755. doi: 10.1007/s11356-020-09962-z. Epub 2020 Jul 6.

Abstract

Air pollution can have severe effects on human health. Because it contributes to serious respiratory and other lung diseases, studying air pollution is important. One way to address this issue is to apply clustering techniques. Clustering algorithms face several main problems. First, the exact shape of the clusters and the number of clusters that the input data can produce are not known in advance. Second, it is not clear how to choose an appropriate algorithm for a particular problem. Finally, repeated runs of the same algorithm lead to different solutions because of factors such as the random initialization of cluster heads. Ensemble algorithms can handle these problems and reduce the bias and variance of the traditional clustering process. However, the ensemble approach has not been studied adequately for clustering. In this paper, we use an enhanced ensemble clustering method to cluster pollution data levels. The study helps identify the preventive measures needed to control further contamination and reduce alarming pollution levels, and its results can be analyzed to find healthy and unhealthy regions in a given area. The ensemble technique also describes the uncertain objects found during clustering. A distinct advantage of the algorithm is that it requires no prior information about the data. The experiments show that the implemented ensemble consensus clustering performs better than basic clustering algorithms.
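
The general idea of ensemble consensus clustering can be illustrated with a minimal sketch. It is not the enhanced method of this paper, whose details the abstract does not give. The sketch assumes a plain 1-D k-means as the ensemble member, a co-association (similarity) matrix, a simple thresholded connected-components consensus function, and a per-object certainty score defined as the mean similarity to the other members of the same consensus cluster. All function names, the 0.5 threshold, and the pollution readings are hypothetical choices made for illustration.

```python
import random

def kmeans_labels(values, k, rng, iters=25):
    """One ensemble member: 1-D k-means with randomly chosen initial centres."""
    centres = rng.sample(values, k)
    labels = [0] * len(values)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: abs(v - centres[c])) for v in values]
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centre
                centres[c] = sum(members) / len(members)
    return labels

def co_association(partitions):
    """Similarity matrix: fraction of members that co-cluster objects i and j."""
    n = len(partitions[0])
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / len(partitions) for c in row] for row in counts]

def consensus_clusters(sim, threshold=0.5):
    """Assumed consensus function: connected components of sim > threshold."""
    n = len(sim)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and sim[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels

def certainty(sim, labels):
    """Assumed certainty score: mean similarity to the object's cluster mates.
    Singleton clusters get 1.0 by convention."""
    scores = []
    for i, lab in enumerate(labels):
        mates = [sim[i][j] for j, other in enumerate(labels)
                 if other == lab and j != i]
        scores.append(sum(mates) / len(mates) if mates else 1.0)
    return scores

# Made-up pollutant readings, for illustration only.
rng = random.Random(0)
readings = [12.0, 13.0, 14.0, 15.0, 80.0, 82.0, 85.0, 149.0, 150.0, 155.0]
# Each member uses a different random start and a different number of clusters.
partitions = [kmeans_labels(readings, rng.choice([2, 3, 4]), rng) for _ in range(30)]
sim = co_association(partitions)
labels = consensus_clusters(sim)
for reading, lab, score in zip(readings, labels, certainty(sim, labels)):
    print(reading, lab, round(score, 2))
```

In this sketch, the number of consensus clusters comes from the agreement between ensemble members instead of being fixed in advance, and objects with a low certainty score are the ones the members disagree about.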

Keywords: Air pollution; Cluster certainty; Consensus functions; Ensemble clustering; Ensemble members; Similarity matrix.

MeSH terms

  • Air Pollution*
  • Algorithms*
  • Cluster Analysis
  • Environmental Pollution
  • Humans