Deep generative models for reject inference in credit scoring

Abstract

Credit scoring models built only on accepted applications may be biased, and this bias can have both statistical and economic consequences. Reject inference attempts to infer the creditworthiness of rejected applications. Inspired by the promising results of semi-supervised deep generative models, this research develops two novel Bayesian models for reject inference in credit scoring that combine Gaussian mixtures and auxiliary variables in a semi-supervised framework with generative models. To the best of our knowledge, this is the first study to couple these concepts. The goal is to improve the classification accuracy of credit scoring models by including rejected applications. Further, our proposed models infer the unknown creditworthiness of the rejected applications by exact enumeration of the two possible outcomes of the loan (default or non-default). The efficient stochastic gradient optimization technique used in deep generative models makes our models suitable for large data sets. Finally, the experiments in this research show that our proposed models perform better than classical and alternative machine learning models for reject inference in credit scoring, and that model performance increases with the amount of data used for model training.
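The "exact enumeration" over the two loan outcomes can be illustrated with a minimal sketch. It assumes the standard unlabeled objective of semi-supervised deep generative models, in which the bound for an unlabeled input is the expectation of the per-class bound under the classifier plus the classifier's entropy. The abstract does not state the paper's actual objective, so this form is an assumption, and the function name `unlabeled_bound` and its arguments are hypothetical.

```python
import numpy as np


def unlabeled_bound(q_default, elbo_nondefault, elbo_default, eps=1e-12):
    """Bound for rejected (unlabeled) applications by summing over both outcomes.

    Assumed form (not taken from the paper): sum_y q(y|x) * ELBO(x, y) + H(q(y|x)).
    q_default       : classifier probability of default, shape (n,)
    elbo_nondefault : per-application bound computed with y = non-default, shape (n,)
    elbo_default    : per-application bound computed with y = default, shape (n,)
    """
    # Column 0 is non-default, column 1 is default.
    q = np.stack([1.0 - q_default, q_default], axis=-1)
    elbo = np.stack([elbo_nondefault, elbo_default], axis=-1)

    # Exact enumeration: weight each outcome's bound by its probability.
    expected = np.sum(q * elbo, axis=-1)

    # Entropy of the classifier output; clipping avoids log(0).
    entropy = -np.sum(q * np.log(np.clip(q, eps, 1.0)), axis=-1)
    return expected + entropy


# Two hypothetical rejected applications with made-up bound values.
bounds = unlabeled_bound(
    q_default=np.array([0.5, 0.0]),
    elbo_nondefault=np.array([-10.0, -3.0]),
    elbo_default=np.array([-10.0, -7.0]),
)
```

Because there are only two outcomes, the sum is computed exactly rather than estimated by sampling the label; in a real model the per-class bounds would come from the generative network rather than fixed numbers.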

Keywords: Reject inference, Deep generative models, Credit scoring, Semi-supervised learning

Review history: Received 23 April 2019, Revised 4 February 2020, Accepted 8 March 2020, Available online 12 March 2020, Version of Record 16 April 2020.

Official paper URL: https://doi.org/10.1016/j.knosys.2020.105758