Rough set and scatter search metaheuristic based feature selection for credit scoring

Authors:

Highlights:

Abstract

As the credit industry has grown rapidly, credit scoring models have been widely adopted by the financial industry to improve cash flow and credit collections. However, credit datasets contain a large amount of redundant information and many redundant features, which lowers the accuracy and raises the complexity of credit scoring models. Effective feature selection methods are therefore necessary for credit datasets with a large number of features. This paper proposes RSFS, a novel feature selection approach based on rough sets and scatter search. In RSFS, conditional entropy serves as the heuristic that guides the search for optimal solutions. Two credit datasets from the UCI database are used to demonstrate the competitive performance of RSFS in combination with three credit scoring models: a neural network, the J48 decision tree, and logistic regression. The experimental results show that RSFS reduces computational cost and improves classification accuracy compared with the base classification methods.
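To make the role of conditional entropy concrete, the following is a minimal sketch of how the entropy of the decision attribute given a candidate feature subset might be computed on a discretized decision table. It assumes the standard Shannon form H(D | B) over the equivalence classes induced by B; the abstract does not state which entropy variant RSFS uses, and the function name, data layout, and toy data are hypothetical. The scatter search procedure itself is not shown.

```python
from collections import Counter, defaultdict
from math import log2


def conditional_entropy(rows, labels, subset):
    """Shannon conditional entropy H(D | B) of the labels given a feature subset.

    rows   -- list of tuples of discrete feature values (assumed layout)
    labels -- list of decision values, one per row
    subset -- indices of the features in the candidate subset B
    """
    # Group the labels by equivalence class: rows that agree on every
    # feature in `subset` are indiscernible and fall into the same block.
    blocks = defaultdict(list)
    for row, label in zip(rows, labels):
        blocks[tuple(row[i] for i in subset)].append(label)

    n = len(labels)
    total = 0.0
    for block in blocks.values():
        size = len(block)
        for count in Counter(block).values():
            p = count / size
            # Weight each block's label entropy by the block's share of rows.
            total -= (size / n) * p * log2(p)
    return total


# Hypothetical toy table: feature 0 determines the label, feature 1 does not.
rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 0, 1, 1]
print(conditional_entropy(rows, labels, [0]))
print(conditional_entropy(rows, labels, [1]))
```

Under this assumption, a lower value means the subset leaves less uncertainty about the credit decision, so a search procedure could use it to rank candidate subsets against the entropy obtained with the full feature set.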

Keywords: Credit scoring, Feature selection, Rough set, Scatter search, Meta-heuristics

Publication history: Available online 12 November 2011.

Official URL: https://doi.org/10.1016/j.eswa.2011.11.011