Layer-constrained variational autoencoding kernel density estimation model for anomaly detection

Abstract

Unsupervised anomaly detection techniques typically rely on the probability density distribution of the data, treating objects with low probability density as abnormal. However, modeling the density distribution of high-dimensional data is known to be hard, which makes detecting anomalies in such data challenging. State-of-the-art methods address this problem by first applying dimension reduction techniques to the data and then detecting anomalies in the low-dimensional space. Unfortunately, the low-dimensional space does not necessarily preserve the density distribution of the original high-dimensional data, which jeopardizes the effectiveness of anomaly detection. In this work, we propose a novel high-dimensional anomaly detection method called LAKE. The key idea of LAKE is to unify the representation learning capacity of a layer-constrained variational autoencoder with the density estimation power of kernel density estimation (KDE). A probability density distribution of the high-dimensional data can then be learned that effectively separates out the anomalies. LAKE consolidates the merits of both worlds, namely the layer-constrained variational autoencoder and KDE, by using a probability density-aware strategy in the training process of the autoencoder. Extensive experiments on six public benchmark datasets demonstrate that our method significantly outperforms state-of-the-art methods in detecting anomalies and achieves up to 37% improvement in score.
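The density-based idea in the abstract, that objects with low probability density are treated as abnormal, can be illustrated with a minimal sketch. This is not LAKE itself: it omits the layer-constrained variational autoencoder and its density-aware training, and simply assumes that low-dimensional representations are already available as NumPy arrays. The function names `kde_log_density` and `flag_anomalies`, the Gaussian kernel, the bandwidth of 0.5, and the 1% quantile threshold are all illustrative assumptions.

```python
import numpy as np


def kde_log_density(train, query, bandwidth):
    """Log density of each query point under a Gaussian KDE fitted on train."""
    train = np.asarray(train, dtype=float)
    query = np.asarray(query, dtype=float)
    n, d = train.shape
    # Squared Euclidean distance between every query point and every training point.
    sq_dists = ((query[:, None, :] - train[None, :, :]) ** 2).sum(axis=2)
    log_kernel = -0.5 * sq_dists / bandwidth**2
    # Log-sum-exp over training points keeps far-away queries from underflowing.
    peak = log_kernel.max(axis=1, keepdims=True)
    log_sum = peak[:, 0] + np.log(np.exp(log_kernel - peak).sum(axis=1))
    # Normalisation: average over n kernels, each a d-dimensional Gaussian.
    log_norm = -np.log(n) - 0.5 * d * np.log(2.0 * np.pi * bandwidth**2)
    return log_sum + log_norm


def flag_anomalies(train, query, bandwidth=0.5, quantile=0.01):
    """Flag query points whose log density falls below a low quantile of the
    training log densities (the threshold rule is an assumption of this sketch)."""
    threshold = np.quantile(kde_log_density(train, train, bandwidth), quantile)
    query_log_density = kde_log_density(train, query, bandwidth)
    return query_log_density < threshold, query_log_density


# Hypothetical 2-D representations: a cloud of normal points, plus two queries.
rng = np.random.default_rng(0)
train = rng.normal(0.0, 1.0, size=(500, 2))
query = np.vstack([np.zeros((1, 2)), np.full((1, 2), 20.0)])
is_anomaly, log_density = flag_anomalies(train, query)
```

In this toy setup the query at the centre of the cloud receives a high density, while the query far from all training points receives a very low one and is flagged. How well such a score works in practice depends on whether the low-dimensional space preserves the density structure of the original data, which is the gap the abstract says LAKE targets.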

Keywords: Anomaly detection, Variational autoencoder, Kernel density estimation, Layer constraint, Deep learning

Review history: Received 16 October 2019, Revised 5 March 2020, Accepted 7 March 2020, Available online 10 March 2020, Version of Record 16 April 2020.

Official paper URL: https://doi.org/10.1016/j.knosys.2020.105753