Predicting creditworthiness in retail banking with limited scoring data

Authors:

Abstract

Research on modelling credit scoring systems, including their relevance to forecasting and decision making in the financial sector, has concentrated on developed countries, whilst developing countries have been largely neglected. Our investigation focuses on the Cameroonian banking sector, with implications for fellow members of the Banque des Etats de L'Afrique Centrale (BEAC) family, which apply the same system. We apply logistic regression (LR), Classification and Regression Tree (CART) and Cascade Correlation Neural Network (CCNN) to build our knowledge-based scoring models. To compare the models' performance, we use ROC curves and Gini coefficients as evaluation criteria and the Kolmogorov-Smirnov curve as a robustness test. The results demonstrate that predictive power can be improved: the share of default cases falls from 15.69% under the current system to 7.68% under the best scoring model, namely CCNN. The predictive capabilities of all models are rated at least very good using the Gini coefficient, and CCNN is rated excellent using the ROC curve. Our robustness test confirms these results. In terms of prediction rate, CCNN is superior to the other techniques investigated in this paper. In addition, a sensitivity analysis identifies previous occupation, borrower's account functioning, guarantees, other loans and monthly expenses as key variables in the forecasting and decision-making processes at the heart of overall credit policy.
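The evaluation criteria named in the abstract (area under the ROC curve, Gini coefficient, Kolmogorov-Smirnov statistic) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, the toy labels and scores are invented for illustration and are not data from the paper, and the sketch assumes that label 1 marks a default and that a higher score means a higher predicted default risk. It uses the standard relation Gini = 2 × AUC − 1.

```python
def split_scores(labels, scores):
    """Separate scores of defaulters (label 1) from non-defaulters (label 0)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    return pos, neg


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen defaulter is scored
    higher than a randomly chosen non-defaulter (ties count as 0.5).
    """
    pos, neg = split_scores(labels, scores)
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def gini(labels, scores):
    """Gini coefficient derived from the AUC: Gini = 2 * AUC - 1."""
    return 2.0 * auc(labels, scores) - 1.0


def ks_statistic(labels, scores):
    """Kolmogorov-Smirnov statistic: the largest gap between the
    cumulative score distributions of the two classes."""
    pos, neg = split_scores(labels, scores)
    best = 0.0
    for t in sorted(set(scores)):
        cdf_pos = sum(s <= t for s in pos) / len(pos)
        cdf_neg = sum(s <= t for s in neg) / len(neg)
        best = max(best, abs(cdf_pos - cdf_neg))
    return best


# Invented toy data (assumption): 1 = default, score = predicted default risk.
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]
toy_scores = [0.1, 0.2, 0.3, 0.6, 0.4, 0.7, 0.8, 0.9]

print("AUC :", auc(toy_labels, toy_scores))
print("Gini:", gini(toy_labels, toy_scores))
print("KS  :", ks_statistic(toy_labels, toy_scores))
```

The pairwise AUC is quadratic in the sample size, which is acceptable for a small illustration; for larger samples a rank-based computation or an established library routine would be the more practical choice.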

Keywords: Predicting creditworthiness, Credit scoring, Cascade correlation neural networks, CART, Limited data, E50, G21, C45

Review history: Received 11 April 2015, Revised 21 March 2016, Accepted 25 March 2016, Available online 12 April 2016, Version of Record 5 May 2016.

Official paper URL: https://doi.org/10.1016/j.knosys.2016.03.023