Gaussian prior based adaptive synthetic sampling with non-linear sample space for imbalanced learning

Authors:

Highlights:

Abstract

When the class distribution is skewed, most learning algorithms fail to capture the characteristics of the data and perform poorly. Learning from imbalanced data is therefore a crucial challenge in data engineering and knowledge discovery. In this work, we propose an imbalanced learning method that generates minority samples to compensate for the skewed class distribution. Unlike existing synthetic over-sampling techniques, data generation is conducted within a hyperplane rather than on a hyperline, so the proposed method is free of the constraints imposed by linear interpolation. In addition, the method minimizes sampling uncertainty and risk by integrating prior knowledge about the minority class instances. Moreover, a multi-objective optimization combined with an error bound model makes the method adaptive. Extensive experiments on imbalanced problems demonstrate that the method can improve the performance of different classification algorithms.
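To make the contrast between sampling on a line and sampling within a region concrete, the following is a minimal sketch and not the paper's algorithm. It assumes NumPy, and the helper names (`k_nearest`, `line_sample`, `region_sample`) are hypothetical. The line-based variant interpolates between a minority point and one neighbor, as SMOTE-style methods do; the region-based variant draws a random convex combination of the point and several neighbors, which is only one simple way to leave the line segment. The Gaussian prior, the error bound model, and the multi-objective optimization mentioned in the abstract are not modeled here.

```python
import numpy as np


def k_nearest(minority, i, k):
    """Return the k nearest minority neighbors of minority[i] (Euclidean)."""
    dists = np.linalg.norm(minority - minority[i], axis=1)
    order = np.argsort(dists)
    return minority[order[1:k + 1]]  # order[0] is the point itself


def line_sample(x, neighbor, rng):
    """Linear interpolation: a synthetic point on the segment x -> neighbor."""
    gap = rng.uniform(0.0, 1.0)
    return x + gap * (neighbor - x)


def region_sample(x, neighbors, rng):
    """Random convex combination of x and its neighbors.

    Assumption for illustration only: Dirichlet(1, ..., 1) weights, which are
    non-negative and sum to 1, so the result lies in the convex hull of the
    points rather than on a single segment.
    """
    points = np.vstack([x, neighbors])
    weights = rng.dirichlet(np.ones(len(points)))
    return weights @ points


rng = np.random.default_rng(0)
minority = rng.normal(size=(20, 3))      # toy minority class, 3 features
neighbors = k_nearest(minority, 0, k=3)

on_line = line_sample(minority[0], neighbors[0], rng)
in_region = region_sample(minority[0], neighbors, rng)
```

With one neighbor the region-based sample reduces to a point on the segment, so the two variants differ only when several neighbors contribute.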

Keywords: Imbalanced learning, Error bound model, Adaptive method, Classification algorithm, Gaussian mixture model

Review history: Received 30 June 2019, Revised 26 September 2019, Accepted 11 November 2019, Available online 18 November 2019, Version of Record 8 February 2020.

Official paper URL: https://doi.org/10.1016/j.knosys.2019.105231