Mean-shift outlier detection and filtering
Authors:
Abstract
Traditional outlier detection methods create a model of the data and then label as outliers the objects that deviate significantly from this model. However, when the data contain many outliers, the outliers also pollute the model. The model then becomes unreliable, rendering most outlier detectors ineffective. To solve this problem, we propose a mean-shift outlier detector. This detector employs a mean-shift technique to modify the data and cancel the bias caused by the outliers. The mean-shift technique replaces every object with the mean of its k nearest neighbors, which essentially removes the effect of outliers before clustering without the need to know which objects are outliers. In addition, it detects outliers based on the distance each object is shifted. Our experiments show that the proposed method works well regardless of the number of outliers in the data. The method outperforms all state-of-the-art methods tested, on real-world numeric datasets as well as on generated numeric and string datasets.
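The two steps described in the abstract (replace each object with the mean of its k nearest neighbors, then score each object by how far it moved) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `mean_shift_scores`, the default values of `k` and `iterations`, the use of Euclidean distance, and the choice to count a point as one of its own neighbors are all assumptions made for illustration.

```python
import numpy as np


def mean_shift_scores(X, k=3, iterations=3):
    """Hypothetical helper: shift points toward their kNN mean, score by shift.

    Assumptions (not stated in the abstract): Euclidean distance, each point
    counts as one of its own k nearest neighbors, and the shift is repeated
    a fixed number of times.
    """
    X = np.asarray(X, dtype=float)
    shifted = X.copy()
    for _ in range(iterations):
        # Pairwise Euclidean distances between the current (shifted) points.
        diff = shifted[:, None, :] - shifted[None, :, :]
        dist = np.sqrt((diff ** 2).sum(axis=-1))
        # Indices of the k closest points for every row (self is at distance 0).
        nn = np.argsort(dist, axis=1)[:, :k]
        # Replace every point with the mean of its k nearest neighbors.
        shifted = shifted[nn].mean(axis=1)
    # Outlier score: distance between the original and the shifted position.
    scores = np.linalg.norm(shifted - X, axis=1)
    return shifted, scores


# Toy data: five points in a tight cluster and one point far away.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1], [0.5, 0.5], [10, 10]])
shifted, scores = mean_shift_scores(X, k=3, iterations=3)
most_suspicious = int(np.argmax(scores))  # index of the point that moved farthest
```

In this sketch the shifted data could be passed to a clustering step (the filtering use), while the scores could be thresholded or ranked (the detection use). How the threshold is chosen is left open here because the abstract does not specify it.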
Keywords: Outlier detection, Anomaly detection, Mean-shift, Medoid-shift, Clustering, Noise filtering, Outlier filtering
Article history: Received 13 May 2020, Revised 19 January 2021, Accepted 1 February 2021, Available online 8 February 2021, Version of Record 21 February 2021.
Official paper URL: https://doi.org/10.1016/j.patcog.2021.107874