GRAND: GAN-based software runtime anomaly detection method using trace information

Neural Netw. 2024 Jan:169:365-377. doi: 10.1016/j.neunet.2023.10.036. Epub 2023 Oct 29.

Abstract

Software runtime anomaly detection identifies the manifestations of faults (known as anomalies) in complex systems before they lead to failure. Whereas most existing methods rely on external performance indicators, this study uses internal execution traces, which reveal failures caused by functional errors as well as those related to software performance. A neural network model called GRAND, which combines a variational autoencoder (VAE) and a generative adversarial network (GAN), is proposed to mine anomalies in the execution trace. Cassandra, a widely used database system, served as the representative subject of the empirical study. The dataset, collected under a well-designed operational profile, contains 5180 time series, each with more than ten million data points. GRAND achieved higher detection performance than two state-of-the-art (SOTA) baseline models, with an F1-score of 99% compared with 93% and 87%. Ablation studies show that the workload information used in GRAND helps determine whether the current internal status is consistent with the task, yielding a 16% improvement in the F1-score, and that the attention mechanism used for data fusion yields a 32% improvement in the F1-score.
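The results above are reported as F1-scores. As a reminder of what that metric measures, here is a minimal sketch of an F1 computation for binary anomaly labels. It is not taken from the paper: the function name `f1_score`, the convention that 1 marks an anomalous sample, and the example labels are all assumptions made for illustration.

```python
def f1_score(y_true, y_pred):
    """F1 for binary labels, assuming 1 marks an anomalous sample (illustrative convention)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # anomalies correctly flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # normal samples flagged as anomalous
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # anomalies that were missed
    if tp == 0:
        # With no true positives, precision and recall are zero or undefined;
        # returning 0.0 is a common convention.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels, not data from the study.
y_true = [0, 0, 1, 1, 1, 0, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0, 1, 0]
score = f1_score(y_true, y_pred)
print(f"F1 = {score:.2f}")
```

Because F1 is the harmonic mean of precision and recall, a detector scores well only when it both finds most anomalies and raises few false alarms.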

Keywords: GAN; Internal execution trace; Software anomaly detection; Unsupervised learning; VAE.

MeSH terms

  • Databases, Factual
  • Empirical Research
  • Neural Networks, Computer*
  • Software*
  • Time Factors