AdaDT: An adaptive decision tree for addressing local class imbalance based on multiple split criteria

Authors: Jianjian Yan, Zhongnan Zhang, Huailin Dong

Abstract

A decision tree is a data-driven classification model whose core component is the split criterion. Although many split criteria have been proposed, almost all of them focus on the global class distribution of the training data and ignore the local class imbalance problem that commonly arises during decision tree induction, even on balanced or roughly balanced binary-class data sets. This study investigates that problem in detail and proposes an adaptive approach based on multiple existing split criteria. In the proposed scheme, the local class imbalance ratio serves as a weight factor that balances the importance of these split criteria when determining the optimal split point at each internal node. To evaluate its effectiveness, the method is applied to twenty roughly balanced real-world binary-class data sets. Experimental results show that the proposed method not only outperforms all other methods but also improves the prediction accuracy of each class.
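
The weighting idea in the abstract can be illustrated with a minimal sketch. The abstract does not say which split criteria are combined or how the weight is applied, so everything specific here is an assumption made for illustration: the two criteria (Gini gain and Hellinger distance), the linear blend, the definition of the imbalance ratio as minority count over majority count, and all function names. It is not the authors' AdaDT algorithm.

```python
import math

def imbalance_ratio(labels):
    """Minority/majority count ratio at a node, in [0, 1]; 1.0 means balanced.
    (Assumed definition; the abstract does not give one.)"""
    pos = sum(labels)
    neg = len(labels) - pos
    if max(pos, neg) == 0:
        return 1.0
    return min(pos, neg) / max(pos, neg)

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def gini_gain(labels, left, right):
    """Impurity reduction obtained by splitting labels into left and right."""
    n = len(labels)
    return (gini(labels)
            - len(left) / n * gini(left)
            - len(right) / n * gini(right))

def hellinger(left, right):
    """Hellinger distance between the class-conditional branch distributions."""
    pos = sum(left) + sum(right)
    neg = len(left) + len(right) - pos
    if pos == 0 or neg == 0:
        return 0.0
    total = 0.0
    for branch in (left, right):
        branch_pos = sum(branch)
        branch_neg = len(branch) - branch_pos
        total += (math.sqrt(branch_pos / pos) - math.sqrt(branch_neg / neg)) ** 2
    return math.sqrt(total)

def best_threshold(values, labels):
    """Pick the threshold on one numeric feature that maximizes the blended score.
    Assumed blend: w * gini_gain + (1 - w) * hellinger, with w the node's
    imbalance ratio, so skewed nodes lean on the skew-insensitive criterion."""
    w = imbalance_ratio(labels)
    best_t, best_score = None, float("-inf")
    # Drop the largest value so the right branch is never empty.
    for t in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = w * gini_gain(labels, left, right) + (1.0 - w) * hellinger(left, right)
        if score > best_score:
            best_t, best_score = t, score
    return best_t, best_score

if __name__ == "__main__":
    print(best_threshold([1, 2, 3, 4, 5, 6], [0, 0, 0, 1, 1, 1]))
```

Because the weight is recomputed from the labels that reach each node, a tree built this way would change its effective criterion as it descends, which is the adaptive behavior the abstract describes. The two criteria have different ranges (Gini gain is at most 0.5 for binary labels, Hellinger distance at most the square root of 2), so a practical version would likely normalize them before blending.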

Keywords: Decision tree, Local class imbalance, Multiple split criteria, Binary classification problem

Paper URL: https://doi.org/10.1007/s10489-020-02061-z