Conditional Wasserstein GAN-based oversampling of tabular data for imbalanced learning
Authors:
Highlights:
• We design a tabular data GAN for oversampling that can handle categorical variables.
• We assess our GAN in a credit scoring setting using multiple real-world datasets.
• We find GAN-based oversampling to outperform advanced SMOTE-type benchmarks.
• Ablations confirm the specific choices in the proposed GAN architecture.
Keywords: Imbalanced learning, Generative adversarial networks, Credit scoring, Oversampling
Review history: Received 30 September 2020, Revised 14 December 2020, Accepted 5 January 2021, Available online 13 January 2021, Version of Record 4 April 2021.
Official paper URL: https://doi.org/10.1016/j.eswa.2021.114582