A novel XGBoost extension for credit scoring class-imbalanced data combining a generalized extreme value link and a modified focal loss function

Authors:

Highlights:

• We propose several XGBoost extensions to learn from class-imbalanced data.

• Fit the techniques to a vast, dynamic, and public Freddie Mac mortgage dataset.

• Assess model performance relative to data complexity and external validity.

• Gauge the business value of the techniques using a profitability measure.

• A proposed GEV link with a modified focal loss in XGBoost is best for rare/outlier cases.

Keywords: Class imbalance, Machine learning, Credit scoring, XGBoost, Freddie Mac

Review history: Received 20 March 2021, Revised 20 March 2022, Accepted 10 April 2022, Available online 13 April 2022, Version of Record 26 April 2022.

Paper URL: https://doi.org/10.1016/j.eswa.2022.117233