SiaLog: detecting anomalies in software execution logs using the siamese network

Authors: Shayan Hashemi, Mika Mäntylä

Abstract

Detecting anomalies in software logs has become a notable concern for software engineers and maintainers, since such anomalies reflect abnormal software execution paths and states. This paper proposes a novel anomaly detection approach based on a Siamese network built on top of Recurrent Neural Networks (RNNs). We introduce a novel training pair generation algorithm for the Siamese network that significantly reduces the number of generated training pairs while maintaining the \(F_1\) score. Additionally, we propose a hybrid model that combines the Siamese network with a traditional feedforward neural network to make end-to-end training possible, reducing the engineering effort of setting up a deep-learning-based log anomaly detector. Furthermore, we validate the approach on the Hadoop Distributed File System (HDFS), Blue Gene/L (BGL), and Hadoop map-reduce task log datasets. To the best of our knowledge, the proposed approach outperforms other methods on the same datasets, with \(F_1\) scores of 0.99, 0.99, and 0.94 on the HDFS, BGL, and Hadoop datasets, respectively, resulting in new state-of-the-art performance. To further evaluate the proposed method, we examine its robustness to log evolution by evaluating the model on synthetically evolved log sequences; we obtain an \(F_1\) score of 0.95 on the HDFS dataset at a noise ratio of \(20\%\). Finally, we explore some side benefits of the Siamese network: an unsupervised log evolution monitoring method and a visualization technique that facilitates model interpretability.

Keywords: Log analysis, Anomaly detection, Siamese network, Deep learning

Paper URL: https://doi.org/10.1007/s10515-022-00365-7