FedLGAN: a method for anomaly detection and repair of hydrological telemetry data based on federated learning

PeerJ Comput Sci. 2023 Nov 7;9:e1664. doi: 10.7717/peerj-cs.1664. eCollection 2023.

Abstract

Existing data repair methods primarily address missing data: they use variational autoencoders to learn the underlying distribution and generate content for the missing parts. However, these methods apply only to missing-data problems and cannot identify abnormal data. In addition, growing public concern about data privacy poses a challenge to traditional methods. This article proposes FedLGAN, a generative adversarial network (GAN) model based on a federated learning framework and a long short-term memory network, for anomaly detection and repair of hydrological telemetry data. In this model, the discriminator of the GAN is used for anomaly detection, while the generator is used to repair abnormal data. To capture the temporal features of the original data, a bidirectional long short-term memory network with an attention mechanism is embedded into the GAN. The federated learning framework avoids privacy leakage of hydrological telemetry data during training. Experimental results on data from four real hydrological telemetry devices demonstrate that the FedLGAN model can achieve anomaly detection and repair while preserving privacy.
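The two ideas in the abstract, combining locally trained models without sharing raw data and using discriminator output to decide which points the generator should replace, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not state the aggregation rule, so sample-weighted averaging (FedAvg-style) is assumed here. The function names, the 0.5 threshold, and the convention that a low discriminator score marks an anomaly are likewise hypothetical choices made for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def federated_average(
    client_weights: List[Dict[str, List[float]]],
    client_sizes: List[int],
) -> Dict[str, List[float]]:
    """Sample-weighted average of per-client parameters (assumed FedAvg-style rule).

    Only parameters leave each client; the raw telemetry stays local.
    """
    total = sum(client_sizes)
    averaged: Dict[str, List[float]] = {}
    for name in client_weights[0]:
        length = len(client_weights[0][name])
        averaged[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(length)
        ]
    return averaged


def detect_and_repair(
    series: Sequence[float],
    disc_scores: Sequence[float],
    generated: Sequence[float],
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the paper
) -> Tuple[List[bool], List[float]]:
    """Flag points the discriminator scores as unlikely to be real, then
    replace each flagged point with the generator's value for that time step."""
    flags = [score < threshold for score in disc_scores]
    repaired = [g if flagged else x for x, g, flagged in zip(series, generated, flags)]
    return flags, repaired


# Two hypothetical devices holding 1 and 3 training windows respectively.
global_weights = federated_average(
    [{"w": [1.0, 2.0]}, {"w": [3.0, 4.0]}],
    [1, 3],
)

# A toy water-level series with one spike that the discriminator scores low.
flags, repaired = detect_and_repair(
    series=[1.0, 50.0, 1.2],
    disc_scores=[0.9, 0.1, 0.8],
    generated=[1.0, 1.1, 1.2],
)
```

In a full model the scores and generated values would come from the trained discriminator and generator networks; here they are supplied directly as toy inputs so that the control flow is visible.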

Keywords: Anomaly detection; Data repair; Federated learning; Generative adversarial network; Long short-term memory networks.

Grants and funding

This work was supported by the National Natural Science Foundation of China under Grant 62072409, by the Zhejiang Provincial Natural Science Foundation under Grant LR21F020003, and by the R&D Program of Zhejiang Provincial Department of Water Resources under Grant RB2216. There was no additional external funding received for this study. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.