Construction of EBRB classifier for imbalanced data based on Fuzzy C-Means clustering
Authors:
Abstract
The Extended Belief Rule-Based (EBRB) system has been widely used to solve real-world problems involving incompleteness, uncertainty, and ambiguity. However, EBRB is essentially a data-driven method in which each rule is generated from a training sample. The resulting extended belief rules may therefore be severely biased when the training data have imbalanced classes: the number of rules generated from majority-class samples (i.e., negative samples) may be much larger than the number generated from minority-class samples (i.e., positive samples), and this imbalance can significantly bias the system's decisions. To resolve this problem, this paper proposes a novel EBRB system based on fuzzy C-means clustering (FCM-EBRB). First, FCM clustering is used to oversample the positive samples and undersample the negative ones so that the two classes are balanced. Next, the construction method of the EBRB is improved, and the system is optimized through an efficient parameter learning strategy. Finally, comprehensive comparison experiments are conducted on a synthetic binary classification dataset and 11 commonly used public class-imbalanced datasets from KEEL. The experimental results show that the proposed method effectively reduces the size of the rule base while achieving high inference accuracy, especially on imbalanced data.
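The abstract does not spell out how FCM drives the resampling, so the following is only a rough sketch of one plausible reading, not the paper's procedure. It assumes that negatives are undersampled by replacing them with FCM cluster centers, that positives are oversampled by interpolating each chosen sample toward its highest-membership center, and that both classes are brought to the mean of the two original class sizes. The names `fuzzy_c_means` and `fcm_balance`, the parameter `k_pos`, and the target size are all hypothetical.

```python
import numpy as np

def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy C-means. Returns (centers, membership matrix U)."""
    rng = np.random.default_rng(seed)
    # Random initial memberships; each row sums to 1.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Centers are membership-weighted means of the samples.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Sample-to-center distances, floored to avoid division by zero.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.maximum(d, 1e-12)
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U

def fcm_balance(X_pos, X_neg, k_pos=3, seed=0):
    """Hypothetical balancing step (an assumption, not the paper's method)."""
    rng = np.random.default_rng(seed)
    # Assumed target size: the mean of the two class sizes.
    target = (len(X_pos) + len(X_neg)) // 2
    # Undersample: summarize the negatives by `target` cluster centers.
    neg_centers, _ = fuzzy_c_means(X_neg, target, seed=seed)
    # Oversample: move copies of positives toward their dominant center.
    k = min(k_pos, len(X_pos))
    pos_centers, U = fuzzy_c_means(X_pos, k, seed=seed)
    n_new = target - len(X_pos)
    idx = rng.integers(0, len(X_pos), size=n_new)
    dominant = pos_centers[U[idx].argmax(axis=1)]
    gap = rng.random((n_new, 1))
    synthetic = X_pos[idx] + gap * (dominant - X_pos[idx])
    return np.vstack([X_pos, synthetic]), neg_centers

# Toy usage: 10 positive and 90 negative samples in two dimensions.
rng = np.random.default_rng(1)
X_pos = rng.normal(2.0, 0.5, size=(10, 2))
X_neg = rng.normal(0.0, 1.0, size=(90, 2))
X_pos_bal, X_neg_bal = fcm_balance(X_pos, X_neg)
print(X_pos_bal.shape, X_neg_bal.shape)
```

Under these assumptions, using cluster centers for the majority class also reduces the number of samples from which rules would be generated, which is in line with the abstract's claim of a smaller rule base.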
Keywords: Fuzzy C-means clustering, Extended Belief Rule-Based system, Imbalanced classification method, Information gain ratio, Parameter learning
Review history: Received 11 May 2021, Revised 6 October 2021, Accepted 7 October 2021, Available online 12 October 2021, Version of Record 22 October 2021.
Official paper URL: https://doi.org/10.1016/j.knosys.2021.107590