Consumer credit scoring models with limited data

Abstract

In this paper we design neural network consumer credit scoring models for financial institutions where the data usually used in previous research are not available. Instead, we use an extensive data set, consisting primarily of accounting data on clients' transactions and account balances, that is available in every financial institution. Because many of these numerous variables are correlated and have questionable information content, we considered variable selection and the selection of training and testing subsets crucial to developing efficient scoring models. We used a genetic algorithm for variable selection. When dividing performing and nonperforming loans into training and testing subsets, we replicated their distribution on a Kohonen artificial neural network; when evaluating the efficiency of the models, however, we used k-fold cross-validation. We developed consumer credit scoring models with error back-propagation artificial neural networks and checked their efficiency against models developed with logistic regression. Given the questionable information content of the data set, the results were surprisingly good, and one of the error back-propagation artificial neural network models showed the best results. We showed that our variable selection method is well suited to the problem addressed.
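
The variable selection step can be illustrated with a minimal sketch of a genetic algorithm that searches over bit masks of variables. This is not the authors' implementation: the abstract does not give the encoding, operators, or fitness function, so everything below is an assumption. In particular, the fitness here is the k-fold cross-validated accuracy of a nearest-centroid classifier (a dependency-free stand-in for the back-propagation network) minus a small hypothetical penalty per selected variable, and the data are synthetic, with only the first two variables carrying signal.

```python
import random

def kfold_indices(n, k, rng):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]

def centroid_accuracy(X, y, mask, folds):
    """k-fold CV accuracy of a nearest-centroid classifier on the masked columns.

    Assumption: this simple classifier stands in for the paper's neural network.
    """
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0
    correct = 0
    for fold in folds:
        held_out = set(fold)
        centroids = {}
        for label in (0, 1):
            rows = [X[i] for i in range(len(X))
                    if i not in held_out and y[i] == label]
            centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]
        for i in fold:
            dist = {lab: sum((X[i][j] - c[m]) ** 2 for m, j in enumerate(cols))
                    for lab, c in centroids.items()}
            correct += int(min(dist, key=dist.get) == y[i])
    return correct / len(X)

def select_variables(X, y, generations=15, pop_size=20, k=5, seed=0):
    """Return the best variable mask found by a small elitist genetic algorithm."""
    rng = random.Random(seed)
    n_vars = len(X[0])
    folds = kfold_indices(len(X), k, rng)

    def fitness(mask):
        # Hypothetical size penalty (0.01 per variable) to favour small subsets.
        return centroid_accuracy(X, y, mask, folds) - 0.01 * sum(mask)

    pop = [[rng.random() < 0.5 for _ in range(n_vars)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]            # truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_vars)        # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation with probability 0.05 per variable.
            child = [(not g) if rng.random() < 0.05 else g for g in child]
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

def make_toy_data(n=200, n_vars=6, seed=1):
    """Synthetic data: variables 0 and 1 depend on the label, the rest are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.gauss(2.0 * label, 1.0) for _ in range(2)]
        row += [rng.gauss(0.0, 1.0) for _ in range(n_vars - 2)]
        X.append(row)
        y.append(label)
    return X, y

X, y = make_toy_data()
best_mask = select_variables(X, y)
print("selected variables:", [j for j, keep in enumerate(best_mask) if keep])
```

Evaluating each candidate subset by cross-validation rather than on a single split is a design choice of this sketch; it reduces the chance that the search rewards a subset that only fits one particular partition of the data.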

Keywords: Consumer credit scoring, Neural networks, Genetic algorithm, Principal component analysis, Variable selection

Publication history: Available online 14 June 2008.

Official URL: https://doi.org/10.1016/j.eswa.2008.06.016