Outlier detection on uncertain data based on local information

Authors:

Highlights:

Abstract

This paper presents a new outlier detection algorithm that uses two kinds of local information, local density and local uncertainty level, to compute an uncertain local outlier factor (ULOF) for each point in an uncertain dataset. All concepts, definitions and formulations of the conventional local outlier factor (LOF) approach are generalized to include uncertainty information. Least-squares polynomial curve fitting is used to generate an approximate probability density function of the distance between two points. An iterative algorithm is proposed to evaluate the K–η–distance, and a pruning strategy is adopted to reduce the size of the candidate set of nearest neighbors. The ULOF algorithm is compared with state-of-the-art approaches, and results of several experiments on synthetic and real data sets demonstrate the effectiveness of the proposed approach.
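
As a rough illustration of the curve-fitting step mentioned in the abstract, the following Python sketch approximates the density of the distance between two uncertain points with a least-squares polynomial. It is not the paper's procedure: the isotropic Gaussian uncertainty model, the Monte Carlo sampling, the histogram grid, the polynomial degree, and the function names `fit_distance_pdf` and `prob_distance_at_most` are all assumptions made only for this example.

```python
import numpy as np


def fit_distance_pdf(mean_a, mean_b, sigma, degree=6,
                     n_samples=20000, bins=40, seed=0):
    """Fit a polynomial to the empirical pdf of the distance between two points.

    Assumption (not from the paper): each uncertain point is an isotropic
    Gaussian centred at its mean with standard deviation `sigma`.
    """
    rng = np.random.default_rng(seed)
    a = rng.normal(mean_a, sigma, size=(n_samples, len(mean_a)))
    b = rng.normal(mean_b, sigma, size=(n_samples, len(mean_b)))
    dist = np.linalg.norm(a - b, axis=1)

    # Empirical density of the sampled distances on a histogram grid.
    density, edges = np.histogram(dist, bins=bins, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])

    # Least-squares polynomial fit to the (bin center, density) pairs.
    poly = np.polynomial.Polynomial.fit(centers, density, deg=degree)
    return poly, (edges[0], edges[-1])


def prob_distance_at_most(poly, support, r):
    """Estimate P(distance <= r) by integrating the fitted polynomial."""
    lo, hi = support
    r = min(max(r, lo), hi)  # clamp r to the range covered by the samples
    antiderivative = poly.integ()
    total = antiderivative(hi) - antiderivative(lo)  # renormalize the fit
    return (antiderivative(r) - antiderivative(lo)) / total


if __name__ == "__main__":
    poly, support = fit_distance_pdf([0.0, 0.0], [3.0, 0.0], sigma=0.5)
    print(prob_distance_at_most(poly, support, 3.0))
```

A closed-form approximation of this kind makes probabilities such as P(distance ≤ r) cheap to evaluate repeatedly, which is presumably useful when searching for a distance threshold that meets a probability level such as η. How the paper actually defines and computes the K–η–distance is not stated in the abstract.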

Keywords: Outlier detection, Uncertain data, Local information

Article history: Received 9 January 2013, Revised 28 May 2013, Accepted 11 July 2013, Available online 20 July 2013.

Official paper URL: https://doi.org/10.1016/j.knosys.2013.07.005