Corporate default forecasting with machine learning

Authors:

Highlights:

• Machine learning models entail gains in accuracy relative to statistical models.

• These gains stem from non-linear relationships between some of the variables and the default outcome.

• The gains are greater for clusters of borrowers whose defaults are more difficult to predict.

• The gains decrease when high-quality information is used in the training dataset.

• Credit allocation based on ML models would imply lower default rates.
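
The non-linearity point in the highlights can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the paper's data, variables, or models: a single synthetic ratio `x`, a default rule that fires at both extremes of `x`, a hand-rolled one-variable logistic regression standing in for the "statistical model", and a small greedy decision tree standing in for the tree-based learners (random forest, gradient boosting) named in the keywords. Accuracy is measured in-sample, which is enough to show the mechanism but is not how a real forecasting study would be evaluated.

```python
import math

def make_data():
    """Synthetic borrowers (hypothetical): default (1) when |x| > 0.5."""
    idx = range(-100, 101)
    xs = [i / 100 for i in idx]
    ys = [1 if abs(i) > 50 else 0 for i in idx]
    return xs, ys

def fit_logit(xs, ys, lr=0.5, steps=2000):
    """One-variable logistic regression fitted by batch gradient descent."""
    w = b = 0.0
    n = len(xs)
    for _ in range(steps):
        gw = gb = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            gw += (p - y) * x
            gb += p - y
        w -= lr * gw / n
        b -= lr * gb / n
    return lambda x: 1 if w * x + b > 0 else 0

def gini(ys):
    """Gini impurity of a list of 0/1 labels."""
    if not ys:
        return 0.0
    p = sum(ys) / len(ys)
    return 2 * p * (1 - p)

def fit_tree(xs, ys, depth=2):
    """Greedy one-variable classification tree, splitting on Gini impurity."""
    majority = 1 if 2 * sum(ys) > len(ys) else 0
    if depth == 0 or gini(ys) == 0.0:
        return lambda x: majority
    best = None
    for t in sorted(set(xs))[:-1]:  # candidate thresholds: observed values
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if best is None or score < best[0]:
            best = (score, t)
    if best is None:  # only one distinct x value, nothing to split on
        return lambda x: majority
    t = best[1]
    lo = fit_tree([x for x in xs if x <= t],
                  [y for x, y in zip(xs, ys) if x <= t], depth - 1)
    hi = fit_tree([x for x in xs if x > t],
                  [y for x, y in zip(xs, ys) if x > t], depth - 1)
    return lambda x: lo(x) if x <= t else hi(x)

def accuracy(model, xs, ys):
    """Share of observations classified correctly (in-sample)."""
    return sum(model(x) == y for x, y in zip(xs, ys)) / len(ys)

xs, ys = make_data()
logit_acc = accuracy(fit_logit(xs, ys), xs, ys)
tree_acc = accuracy(fit_tree(xs, ys), xs, ys)
print(f"logistic regression accuracy: {logit_acc:.3f}")
print(f"depth-2 tree accuracy:        {tree_acc:.3f}")
```

Because the default rule is U-shaped in `x`, a model that is linear in `x` cannot separate the two risky tails from the safe middle, whereas two threshold splits can. When the relationship is monotonic instead, the two approaches would be expected to perform much more alike.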

Keywords: Credit scoring, Machine learning, Random forest, Gradient boosting machine

Review history: Received 24 July 2019, Revised 12 March 2020, Accepted 13 May 2020, Available online 27 May 2020, Version of Record 14 July 2020.

Paper URL: https://doi.org/10.1016/j.eswa.2020.113567