Inferring multi-stage risk for online consumer credit services: An integrated scheme using data augmentation and model enhancement
Authors:
Highlights:
• A decision support system, i.e., a socio-technical artifact, is developed to extend known solutions to new problems.
• To tackle the “thin file” challenge, we augment the data with additional consumer information derived from phone usage behavior.
• To address the challenge of properly partitioning repayment timing, a multi-stage credit risk prediction module is adopted, using ordinal classification and a heterogeneous ensemble method.
• A three-step analysis, including prediction evaluation, model interpretation using Shapley Additive Explanations (SHAP), and welfare analysis, was performed to evaluate our proposed scheme's efficacy.
• The legal and ethical issues of collecting and analyzing consumers' behavioral data in the context of online consumer credit services are carefully considered.
Abstract
In recent years, online consumer credit services have emerged in e-commerce. Although such services boost sales, how best to allocate credit to consumers remains a critical open issue. In this paper, a comprehensive scheme is proposed that uses data augmentation and model enhancement to infer online consumer credit risk. The proposed scheme augments consumer profiles with phone usage information to alleviate the “thin file” challenge, and enhances the predictive model by taking a multi-stage view of consumers' repayment timing to achieve a more fine-grained credit risk determination. A three-step analysis, including prediction evaluation, model interpretation using Shapley Additive Explanations (SHAP), and welfare analysis, was performed to evaluate the efficacy of the proposed scheme. We found that phone usage information enhanced predictive performance and that the underlying psychological mechanisms can be analyzed by relating the feature interpretations to existing theories. The follow-up welfare analysis illustrates the business value of the proposed scheme.
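The multi-stage view described above can be illustrated with a minimal sketch. It assumes a Frank and Hall style ordinal decomposition, in which each base model estimates P(stage > k) for every threshold k, and a simple averaging ensemble over heterogeneous base models. The stage labels, function names, and probability values below are hypothetical and are not taken from the paper; the authors' actual module may differ.

```python
from typing import List, Sequence, Tuple

# Hypothetical ordered repayment stages (an assumption, not the paper's partition).
STAGES = ["on_time", "delinquent", "default"]


def average_ensemble(per_model: Sequence[Sequence[float]]) -> List[float]:
    """Average each threshold probability P(stage > k) across base models."""
    n_models = len(per_model)
    n_thresholds = len(per_model[0])
    return [sum(m[k] for m in per_model) / n_models for k in range(n_thresholds)]


def exceedance_to_stage_probs(exceed: Sequence[float]) -> List[float]:
    """Convert P(stage > k), k = 0..K-2, into K per-stage probabilities."""
    # Force the sequence to be non-increasing so no stage gets negative mass.
    mono: List[float] = []
    prev = 1.0
    for p in exceed:
        prev = min(prev, p)
        mono.append(prev)
    probs = [1.0 - mono[0]]                      # P(stage = 0)
    for k in range(1, len(mono)):
        probs.append(mono[k - 1] - mono[k])      # P(stage = k)
    probs.append(mono[-1])                       # P(stage = K-1)
    return probs


def predict_stage(per_model: Sequence[Sequence[float]]) -> Tuple[str, List[float]]:
    """Return the most likely stage and the full stage distribution."""
    probs = exceedance_to_stage_probs(average_ensemble(per_model))
    best = max(range(len(probs)), key=probs.__getitem__)
    return STAGES[best], probs


# Made-up outputs of three different base models for one consumer:
# each row is [P(stage > on_time), P(stage > delinquent)].
example_outputs = [[0.6, 0.2], [0.8, 0.4], [0.7, 0.3]]
stage, stage_probs = predict_stage(example_outputs)
print(stage, [round(p, 3) for p in stage_probs])
```

Averaging the threshold probabilities before converting them keeps the ordinal structure intact; other combination rules, such as stacking, would fit the same interface.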
Keywords: Online consumer credit risk, Default, Delinquency, Phone usage data, Machine learning, Model interpretation, Welfare analysis
Review history: Received 2 November 2020, Revised 18 April 2021, Accepted 2 June 2021, Available online 7 June 2021, Version of Record 19 August 2021.
Official paper URL: https://doi.org/10.1016/j.dss.2021.113611