Bankruptcy prediction for SMEs using transactional data and two-stage multiobjective feature selection

Authors:

Highlights:

• A bankruptcy prediction model for SMEs that uses transactional data under a scenario where no accounting data are required.

• Offline and online test results both confirm that transactional data–based variables improve SME bankruptcy prediction.

• We propose a two-stage multiobjective feature-selection method and compare it with benchmark methods.

Abstract

Many bankruptcy prediction models for small and medium-sized enterprises (SMEs) are built using accounting-based financial ratios. This study proposes a bankruptcy prediction model for SMEs that uses transactional data and payment network–based variables under a scenario where no financial (accounting) data are required. Offline and online test results both confirmed the predictive capability and economic benefit of transactional data–based variables. However, incorporating those features into predictive models creates a high-dimensional problem, which degrades model interpretability and increases feature acquisition costs. Thus, we propose a two-stage multiobjective feature-selection method that optimizes both the number of features and model classification performance. The results showed that the proposed model achieved similar classification performance while greatly reducing the cardinality of the feature subset. Finally, the feature importance evaluation for features in the optimal subset confirmed the importance of transactional data and payment network–based variables for bankruptcy prediction.

Keywords: Bankruptcy prediction, Payment and transactional data, Expected maximum profit, Data imbalance, Feature selection

Article history: Received 23 April 2020, Revised 20 October 2020, Accepted 21 October 2020, Available online 3 November 2020, Version of Record 30 November 2020.

Official paper URL: https://doi.org/10.1016/j.dss.2020.113429