IMCStacking: Cost-sensitive stacking learning with feature inverse mapping for imbalanced problems

Authors:

Highlights:

Abstract

Stacking-related methods have developed rapidly in recent years. However, few Stacking-based ensemble methods are designed for imbalanced problems. In this paper, a novel Feature Inverse Mapping based Cost-sensitive Stacking learning method (IMCStacking) is proposed to address the problems encountered in imbalanced classification. In IMCStacking, we use cost-sensitive Logistic Regression as the final classifier to assign different costs to majority and minority samples. Furthermore, a fast and effective feature inverse mapping technique is applied in IMCStacking to maximize the utilization of the cross-validation process during the Stacking ensemble. This technique helps the proposed method learn better classification thresholds for imbalanced problems. As a result, IMCStacking implements the cost-sensitive strategy at both the data level and the feature level to overcome class imbalance. Moreover, both linear and forest-based approaches serve as base classifiers in IMCStacking to ensure sufficient generalization. Finally, comprehensive comparison experiments on training time and mean accuracy (M-ACC) over typical imbalanced datasets from KEEL demonstrate both the effectiveness and the efficiency of the proposed IMCStacking.
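To make the cost-sensitive idea in the abstract concrete, here is a minimal sketch of a logistic regression trained with different misclassification costs for the two classes. It is not the authors' implementation: it covers neither the Stacking ensemble nor the feature inverse mapping, whose details the abstract does not give. The toy data, the function names, and the choice of the inverse class ratio as the minority cost are all assumptions made for illustration.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_weighted_logreg(xs, ys, cost_pos=1.0, cost_neg=1.0, lr=0.5, steps=5000):
    """Fit p(y=1|x) = sigmoid(w*x + b) by gradient descent on a
    class-weighted log loss (hypothetical helper, 1-D feature only)."""
    w, b = 0.0, 0.0
    costs = [cost_pos if y == 1 else cost_neg for y in ys]
    total = sum(costs)
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, y, c in zip(xs, ys, costs):
            err = c * (sigmoid(w * x + b) - y)  # cost-scaled residual
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / total
        b -= lr * grad_b / total
    return w, b


def minority_recall(xs, ys, w, b):
    # Fraction of minority (label 1) samples predicted as minority.
    hits = sum(1 for x, y in zip(xs, ys) if y == 1 and sigmoid(w * x + b) >= 0.5)
    return hits / sum(ys)


# Toy imbalanced data (assumed): 20 majority samples, 4 minority samples,
# overlapping on a single feature.
neg = [-2.0 + 0.15 * i for i in range(20)]
pos = [0.5, 1.0, 1.5, 2.0]
xs = neg + pos
ys = [0] * len(neg) + [1] * len(pos)

# Plain fit versus a fit where each minority error costs len(neg)/len(pos)
# times as much as a majority error (an assumed cost ratio).
w_plain, b_plain = fit_weighted_logreg(xs, ys)
w_cost, b_cost = fit_weighted_logreg(xs, ys, cost_pos=len(neg) / len(pos))

# Decision threshold on x where the predicted probability equals 0.5.
thr_plain = -b_plain / w_plain
thr_cost = -b_cost / w_cost
recall_plain = minority_recall(xs, ys, w_plain, b_plain)
recall_cost = minority_recall(xs, ys, w_cost, b_cost)

print("threshold (plain, cost-sensitive):", thr_plain, thr_cost)
print("minority recall (plain, cost-sensitive):", recall_plain, recall_cost)
```

Raising the minority cost moves the decision threshold toward the majority class, which is one way to read the abstract's remark about learning better classification thresholds for imbalanced data.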

Keywords: Feature mapping, Cost-sensitive, Decision tree ensemble, Linear classifiers, Stacking, Imbalanced problems

Review history: Received 6 November 2017, Revised 19 January 2018, Accepted 19 February 2018, Available online 24 February 2018, Version of Record 26 May 2018.

Official paper URL: https://doi.org/10.1016/j.knosys.2018.02.031