Oversampling technique based on fuzzy representativeness difference for classifying imbalanced data

Authors: Ruonan Ren, Youlong Yang, Liqin Sun

Abstract

The class imbalance problem poses a difficulty for learning algorithms in pattern classification. Oversampling is one of the most widely used techniques for addressing it, but most oversampling methods use the sample size ratio as the measure of imbalance. This paper proposes a fuzzy representativeness difference-based oversampling technique using affinity propagation and the chromosome theory of inheritance (FRDOAC). The fuzzy representativeness difference (FRD) is adopted as a new imbalance metric that focuses on the importance of samples rather than their number. FRDOAC first finds the representative samples of each class using affinity propagation. Second, the fuzzy representativeness of every sample is calculated using the Mahalanobis distance. Finally, synthetic positive samples are generated according to the chromosome theory of inheritance until the fuzzy representativeness difference between the two classes is small. A thorough experimental study on 16 benchmark datasets shows that the method outperforms other advanced imbalanced classification algorithms on various evaluation metrics.
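The three steps described in the abstract can be illustrated with a rough sketch. The abstract gives no formulas, so everything specific here is an assumption and not the authors' FRDOAC: the class exemplar is the sample nearest the class mean (a simple stand-in for affinity propagation), fuzzy representativeness is taken as `exp(-d)` with `d` the Mahalanobis distance to that exemplar, and "inheritance" is modeled as a child taking each feature from one of two randomly chosen positive parents. The names `make_scorer`, `oversample`, `tol`, and `max_new` are hypothetical.

```python
import numpy as np

def make_scorer(X_class):
    """Return a function scoring samples by an assumed fuzzy representativeness."""
    # Regularize the covariance so it stays invertible for small classes.
    cov = np.cov(X_class, rowvar=False) + 1e-6 * np.eye(X_class.shape[1])
    cov_inv = np.linalg.inv(cov)

    def dist(X, center):
        # Mahalanobis distance from each row of X to `center`.
        diff = X - center
        sq = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
        return np.sqrt(np.maximum(sq, 0.0))

    # Assumption: exemplar = sample closest to the class mean
    # (stand-in for the affinity propagation step).
    exemplar = X_class[np.argmin(dist(X_class, X_class.mean(axis=0)))]
    # Assumption: representativeness decays as exp(-distance), in (0, 1].
    return lambda X: np.exp(-dist(np.atleast_2d(X), exemplar))

def oversample(X_pos, X_neg, tol=0.05, max_new=1000, seed=0):
    """Generate positive samples until the representativeness gap is small."""
    rng = np.random.default_rng(seed)
    score_pos = make_scorer(X_pos)
    neg_total = make_scorer(X_neg)(X_neg).sum()
    pos_total = score_pos(X_pos).sum()
    children = []
    # Stop when the gap is within `tol` of the negative total,
    # or when the safety cap `max_new` is reached.
    while neg_total - pos_total > tol * neg_total and len(children) < max_new:
        i, j = rng.choice(len(X_pos), size=2, replace=False)
        # Each feature is inherited from one of the two parents.
        from_first = rng.random(X_pos.shape[1]) < 0.5
        child = np.where(from_first, X_pos[i], X_pos[j])
        children.append(child)
        pos_total += score_pos(child)[0]
    return np.array(children).reshape(-1, X_pos.shape[1])
```

The point of the sketch is the stopping rule: the loop ends when the summed representativeness of the two classes is close, not when the sample counts match, which is the distinction the abstract draws between FRD and the sample size ratio.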

Keywords: Class imbalanced problem, Oversampling technique, Affinity propagation, Chromosome theory of inheritance, Fuzzy representativeness difference

Paper URL: https://doi.org/10.1007/s10489-020-01644-0