Two Class Pruned Log Message Anomaly Detection

SN Comput Sci. 2021;2(5):391. doi: 10.1007/s42979-021-00772-9. Epub 2021 Jul 24.

Abstract

Log messages are widely used in cloud servers and other systems. Millions of logs are generated each day, which makes them an important data source for anomaly detection. However, they are complex, unstructured text messages, which makes this task difficult. In this paper, a hybrid log message anomaly detection technique is proposed that prunes both positive and negative logs. Reliable positive log messages are first selected using a Gaussian mixture model (GMM) algorithm. Reliable negative logs are then selected iteratively using the K-means, GMM, and Dirichlet process Gaussian mixture model (DPGMM) methods. It is shown that the precision for positive and negative logs is high with pruning. Anomaly detection is then performed using a deep learning long short-term memory (LSTM) network. The proposed model is evaluated using the well-known BGL, OpenStack, and Thunderbird data sets. The results indicate that the proposed model performs better than several well-known algorithms.
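The pruning idea can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the features, the selection rule, or the thresholds. The sketch assumes each log message has already been reduced to a single numeric score (hypothetical), that higher scores correspond to the positive class (an assumption), and that a sample counts as "reliable" when its posterior under a two-component Gaussian mixture is at least 0.95 (a hypothetical threshold). It uses one hand-written 1-D mixture fitted with EM in place of the iterative K-means, Gaussian mixture, and Dirichlet process Gaussian mixture steps named above.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def posteriors(x, weights, means, variances):
    """Posterior probability of each of the two components for sample x."""
    joint = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
    total = sum(joint)
    if total == 0.0:  # both densities underflowed: treat the sample as undecided
        return [0.5, 0.5]
    return [j / total for j in joint]


def fit_gmm_1d(xs, iters=100):
    """Fit a two-component 1-D Gaussian mixture with plain EM."""
    lo, hi = min(xs), max(xs)
    means = [lo, hi]  # start one component at each extreme
    init_var = max(((hi - lo) / 2.0) ** 2, 1e-6)
    variances = [init_var, init_var]
    weights = [0.5, 0.5]
    for _ in range(iters):
        # E-step: responsibility of each component for each sample
        resp = [posteriors(x, weights, means, variances) for x in xs]
        # M-step: re-estimate weight, mean, and variance of each component
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # component received no mass; leave it unchanged
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk
            variances[k] = max(var, 1e-6)  # floor avoids a collapsed component
            weights[k] = nk / len(xs)
    return weights, means, variances


def select_reliable(scores, threshold=0.95):
    """Keep only samples assigned to one component with high posterior.

    Assumption: the component with the larger mean is the positive class.
    Samples below the threshold for both components are pruned.
    """
    weights, means, variances = fit_gmm_1d(scores)
    pos_k = 0 if means[0] > means[1] else 1
    neg_k = 1 - pos_k
    positives, negatives = [], []
    for x in scores:
        post = posteriors(x, weights, means, variances)
        if post[pos_k] >= threshold:
            positives.append(x)
        elif post[neg_k] >= threshold:
            negatives.append(x)
    return positives, negatives


if __name__ == "__main__":
    # Hypothetical per-message scores, invented for illustration only.
    scores = [0.10, 0.12, 0.09, 0.11, 0.50, 0.90, 0.92, 0.88, 0.91]
    positives, negatives = select_reliable(scores)
    print("reliable positives:", positives)
    print("reliable negatives:", negatives)
```

In a pipeline of the kind the abstract describes, the pruned positive and negative sets would then serve as training labels for the downstream sequence model; that step is omitted here.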

Keywords: Anomaly detection; Deep learning; Hybrid learning; Log messages.