An adaptive and general model for label noise detection using relative probabilistic density

Authors:

Highlights:

Abstract

We present a model, called relative probability density (RPD), that detects label noise by exploiting the contrasting characteristics of different classes. RPD has a natural ratio structure, so a powerful estimator, the Kullback–Leibler Importance Estimation Procedure (KLIEP), can be applied directly to compute it rather than estimating the probability densities in the numerator and denominator separately. In addition, the RPD model can be reduced to a new form that can be calculated with only a probabilistic classifier, without relying on any other specific measurements, specific loss functions, noise estimation, or extra parameters. Furthermore, we propose an RPD-based filter learning framework that adaptively optimizes the threshold to accurately identify label noise. Experimental results on synthetic and real data sets demonstrate that the RPD-based filter learning framework is more effective than some representative methods. Its superior generality and adaptiveness, together with its simple design, make it a good replacement for traditional probabilistic classifiers on label-noisy data.
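The idea of scoring a sample with a density ratio obtained from a probabilistic classifier can be illustrated with a minimal sketch. This is not the paper's formulation: it assumes the ratio compares the density of a sample under its observed class with its density under all other classes, and rewrites that ratio with Bayes' rule so that only classifier posteriors and empirical class priors are needed. The function names `relative_density` and `flag_noisy`, the example data, and the fixed threshold of 1.0 are all hypothetical; the abstract describes a threshold that is optimized adaptively, which is not reproduced here.

```python
from collections import Counter


def relative_density(posteriors, labels):
    """Score each sample by p(x | y) / p(x | not y) for its observed label y.

    Assumption (not taken from the paper): by Bayes' rule this ratio equals
    [P(y|x) / (1 - P(y|x))] * [(1 - P(y)) / P(y)], so it can be computed from
    classifier posteriors and empirical class priors alone.

    posteriors: list of dicts mapping class -> P(class | x)
    labels:     list of observed (possibly noisy) labels
    """
    n = len(labels)
    priors = {c: k / n for c, k in Counter(labels).items()}
    eps = 1e-12  # keeps the odds finite when a posterior is exactly 0 or 1
    scores = []
    for post, y in zip(posteriors, labels):
        p = min(max(post[y], eps), 1.0 - eps)
        prior = priors[y]
        scores.append((p / (1.0 - p)) * ((1.0 - prior) / prior))
    return scores


def flag_noisy(scores, threshold=1.0):
    """Return indices whose score falls below a fixed, hypothetical threshold."""
    return [i for i, s in enumerate(scores) if s < threshold]


# Hypothetical posteriors from any probabilistic classifier.
example_posteriors = [
    {"a": 0.9, "b": 0.1},
    {"a": 0.2, "b": 0.8},  # labelled "a" but the classifier favours "b"
    {"a": 0.3, "b": 0.7},
    {"a": 0.4, "b": 0.6},
]
example_labels = ["a", "a", "b", "b"]

example_scores = relative_density(example_posteriors, example_labels)
suspects = flag_noisy(example_scores)
```

In this sketch a score below 1 means the sample looks more typical of the other classes than of its labelled class, so the second sample is the only one flagged. In practice the posteriors would come from a classifier trained on the noisy data, and the threshold would be tuned rather than fixed.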

Keywords: Label noise, Classification, Relative probabilistic density

Review history: Received 16 July 2021, Revised 10 November 2021, Accepted 5 December 2021, Available online 10 December 2021, Version of Record 6 January 2022.

Official paper URL: https://doi.org/10.1016/j.knosys.2021.107907