Hybrid system with genetic algorithm and artificial neural networks and its application to retail credit risk assessment

Abstract

Banks around the world have accumulated large quantities of information about their clients and their financial and payment histories. These databases can be used for credit risk assessment, but they are commonly high dimensional. Irrelevant features in a training dataset may reduce the accuracy of classification, so data preprocessing is required to prepare the data and increase predictive accuracy. Feature selection is a preprocessing technique commonly used on high-dimensional data; its purposes include reducing dimensionality, removing irrelevant and redundant features, facilitating data understanding, reducing the amount of data needed for learning, improving the predictive accuracy of algorithms, and increasing the interpretability of models. In this paper we investigate the extent to which the total data owned by a bank can be a good basis for predicting a borrower's ability to repay a loan on time. We propose a feature selection technique for finding an optimum feature subset that enhances the classification accuracy of neural network classifiers. Experiments were conducted on a credit dataset collected at a Croatian bank to assess the accuracy of our technique. We found that the hybrid system with a genetic algorithm is competitive and can be used as a feature selection technique to discover the most significant features in determining the risk of default.
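The general idea of genetic-algorithm wrapper feature selection can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the encoding, operators, or parameters, so the bit-mask encoding, tournament selection, one-point crossover, elitism, and all parameter values below are assumptions. To stay dependency-free, a nearest-centroid classifier stands in for the neural network as the fitness function, and the data are synthetic (hypothetical), not the Croatian bank dataset.

```python
import random

def make_data(n=200, n_informative=3, n_noise=7, seed=0):
    """Synthetic two-class data (hypothetical): the first columns carry signal, the rest are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        shift = 1.5 if label else -1.5
        row = [rng.gauss(shift, 1.0) for _ in range(n_informative)]
        row += [rng.gauss(0.0, 3.0) for _ in range(n_noise)]
        X.append(row)
        y.append(label)
    return X, y

def centroid_accuracy(mask, X_train, y_train, X_val, y_val):
    """Stand-in fitness: validation accuracy of a nearest-centroid classifier
    restricted to the features whose mask bit is 1 (the paper uses neural networks)."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    centroids = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X_train, y_train) if t == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    correct = 0
    for x, t in zip(X_val, y_val):
        dist = {c: sum((x[j] - m) ** 2 for j, m in zip(cols, mu))
                for c, mu in centroids.items()}
        correct += (min(dist, key=dist.get) == t)
    return correct / len(y_val)

def ga_select(fitness, n_features, pop_size=20, generations=15, p_mut=0.05, seed=1):
    """Search over feature bit masks with a simple generational GA (assumed design)."""
    rng = random.Random(seed)
    # Start from the full feature set plus random subsets.
    pop = [[1] * n_features]
    pop += [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size - 1)]
    best = max(pop, key=fitness)

    def tournament():
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        children = [best[:]]  # elitism: the best mask always survives
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            cut = rng.randint(1, n_features - 1)      # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [1 - g if rng.random() < p_mut else g for g in child]  # bit-flip mutation
            children.append(child)
        pop = children
        best = max(pop, key=fitness)
    return best, fitness(best)

X, y = make_data()
X_tr, y_tr, X_val, y_val = X[:140], y[:140], X[140:], y[140:]
fit = lambda m: centroid_accuracy(m, X_tr, y_tr, X_val, y_val)
mask, score = ga_select(fit, n_features=10)
print("selected mask:", mask, "validation accuracy:", round(score, 3))
```

Because the full feature set is one of the starting individuals and the best mask is carried over each generation, the returned subset scores at least as well on the validation split as using all features. That score is an optimistic estimate, since the same split guides the search; a separate held-out test set would be needed for an unbiased figure.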

Keywords: Classification, Credit scoring, Neural network, Genetic algorithm, Feature selection

Publication history: Available online 9 May 2012.

Official URL: https://doi.org/10.1016/j.eswa.2012.05.023