Probabilistic and discriminative group-wise feature selection methods for credit risk analysis

Abstract

Many financial organizations, such as banks and retailers, rely heavily on computational credit risk analysis (CRA) tools because of recent financial crises and stricter regulations. This strategy enables them to manage their financial and operational risks within the pool of financial institutions. Machine learning algorithms, especially binary classifiers, are very popular for this purpose. In real-life applications such as CRA, feature selection algorithms are used to decrease data acquisition cost and to increase the interpretability of the decision process. Applying feature selection methods directly to CRA data sets may not help because of categorical variables such as marital status. Such features are usually converted into binary features using 1-of-k encoding, and eliminating only a subset of the binary features in a group does not reduce data collection cost or improve interpretability, since the original categorical variable must still be collected. In this study, we propose to use the probit classifier with a proper prior structure, and multiple kernel learning with a proper kernel construction procedure, to perform group-wise feature selection (i.e., eliminating a group of features together if they are not helpful). Experiments on two standard CRA data sets show the validity and effectiveness of the proposed binary classification algorithm variants.
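
The grouping idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy data, the helper names (`one_hot_groups`, `group_kernels`, `combine`), the choice of one linear kernel per group, and the hand-set kernel weights are all assumptions made for illustration. In the multiple kernel learning variant the weights would be learned, and the abstract does not specify the kernel construction procedure.

```python
import numpy as np

def one_hot_groups(columns):
    """1-of-k encode each categorical column and record which binary
    columns belong to which original feature (its group)."""
    blocks, groups, start = [], {}, 0
    for name, values in columns.items():
        categories = sorted(set(values))
        block = np.array([[1.0 if v == c else 0.0 for c in categories]
                          for v in values])
        blocks.append(block)
        groups[name] = list(range(start, start + len(categories)))
        start += len(categories)
    return np.hstack(blocks), groups

def group_kernels(X, groups):
    """Build one linear kernel per group (an assumed construction)."""
    return {name: X[:, idx] @ X[:, idx].T for name, idx in groups.items()}

def combine(kernels, weights):
    """Weighted sum of the group kernels; a zero weight drops the whole group."""
    n = next(iter(kernels.values())).shape[0]
    K = np.zeros((n, n))
    for name, Km in kernels.items():
        K += weights.get(name, 0.0) * Km
    return K

# Hypothetical toy data: two categorical features, four applicants.
columns = {
    "marital_status": ["single", "married", "divorced", "married"],
    "housing": ["rent", "own", "own", "rent"],
}
X, groups = one_hot_groups(columns)
kernels = group_kernels(X, groups)

# Hand-set weights standing in for learned ones: marital_status is eliminated.
weights = {"marital_status": 0.0, "housing": 1.0}
K = combine(kernels, weights)
```

Because a zero weight removes every binary column derived from `marital_status` at once, that variable would no longer need to be collected, which is the practical benefit of group-wise selection over eliminating individual binary columns.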

Keywords: Credit risk analysis, Feature selection, Probit classifier, Multiple kernel learning, Sparsity

Review process: Available online 27 April 2012.

Official paper URL: https://doi.org/10.1016/j.eswa.2012.04.050