Error-Aware Data Clustering for In-Network Data Reduction in Wireless Sensor Networks

Sensors (Basel). 2020 Feb 13;20(4):1011. doi: 10.3390/s20041011.

Abstract

A wireless sensor network (WSN) deploys hundreds or thousands of nodes that may generate large volumes of data over time. Handling this volume of collected data is a real challenge for energy-constrained sensor nodes. Therefore, numerous research works have designed efficient data clustering techniques for WSNs that eliminate redundant data before transmission to the sink while preserving the fundamental properties of the data. This paper develops a new error-aware data clustering (EDC) technique at the cluster-heads (CHs) for in-network data reduction. The proposed EDC consists of three adaptive modules that allow users to choose the module that suits their requirements and the quality of the data. The histogram-based data clustering (HDC) module groups temporally correlated data into clusters and eliminates correlated data from each cluster. The recursive outlier detection and smoothing (RODS) with HDC module provides error-aware data clustering: it detects random outliers using the temporal correlation of the data to keep data reduction errors within a predefined threshold. The verification of RODS (V-RODS) with HDC module detects both random and frequent outliers simultaneously, based on the temporal and spatial correlations of the data. The simulation results show that the proposed EDC is computationally cheap, reduces a significant amount of redundant data with minimal error, and provides efficient error-aware data clustering solutions for remote environmental monitoring applications.

Keywords: environmental monitoring; in-network data reduction; k-means; k-medoids; outlier detection; partitional clustering; time-series clustering; wireless sensor network.
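
To make the histogram-based clustering idea from the abstract concrete, the following is a minimal sketch and not the paper's HDC algorithm, whose details the abstract does not give. It assumes equal-width bins over one window of readings from a single sensor, and it assumes each non-empty bin is summarized by its mean and a count. The function names (`histogram_clusters`, `reduce_readings`, `max_abs_error`) and the `bin_width` parameter are hypothetical and chosen only for illustration.

```python
from statistics import mean


def histogram_clusters(readings, bin_width):
    """Group readings into equal-width bins (assumed scheme, not the paper's).

    Returns a dict mapping bin index -> list of readings in that bin.
    """
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    if not readings:
        return {}
    low = min(readings)  # bins start at the smallest reading in the window
    clusters = {}
    for value in readings:
        index = int((value - low) // bin_width)
        clusters.setdefault(index, []).append(value)
    return clusters


def reduce_readings(readings, bin_width):
    """Replace each non-empty bin with one (mean, count) pair."""
    clusters = histogram_clusters(readings, bin_width)
    return [(mean(vals), len(vals)) for _, vals in sorted(clusters.items())]


def max_abs_error(readings, bin_width):
    """Largest deviation of any reading from its bin's representative mean."""
    clusters = histogram_clusters(readings, bin_width)
    return max(
        (abs(v - mean(vals)) for vals in clusters.values() for v in vals),
        default=0.0,
    )


if __name__ == "__main__":
    # Hypothetical temperature window: five readings collapse into two pairs.
    window = [20.0, 20.1, 20.2, 25.0, 25.1]
    print(reduce_readings(window, bin_width=1.0))
    print(max_abs_error(window, bin_width=1.0))
```

In this sketch every reading in a bin lies within one bin width of the others, so the deviation from the bin mean stays below `bin_width`. That gives a simple way to tie the reduction error to a user-chosen threshold, in the spirit of the error-aware modules described above.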