A dynamic financial distress forecast model with multiple forecast results under unbalanced data environment

Abstract

Corporate financial distress forecasts are important for companies, investors, and regulatory authorities. However, because most financial distress forecast (FDF) models in previous studies were based on a single time dimension, they have tended to ignore two key characteristics of financial distress data: imbalanced data sets and concept drift in the data stream. To overcome these problems, this study proposes a new dynamic financial distress forecasting (DFDF) approach, the Adaptive Neighbor SMOTE-Recursive Ensemble Approach (ANS-REA), which allows for multiple forecast results from unbalanced data streams. An empirical experiment was conducted on 373 financially distressed samples and 1119 matched normal samples drawn from Chinese listed companies between 2007 and 2017. Measured by overall average AUC, the Random Forest (RF) classifier outperformed other commonly used classifiers, including Support Vector Machine (SVM), Decision Tree (DT), baggingDT, oblique random forests (obRF), Kernel Ridge Regression (KRR), and Bayes, in classifying DFDF data. In addition, the proposed ANS-REA algorithm performed better than SMOTE, ANS, the Random Walk Over-Sampling Approach (RWO), the Rapidly Converging Gibbs sampling technique (RACOG), SMOTEboost, RUSboost, SMOTEbagging, wRACOG, and the Majority Weighted Minority Oversampling Technique (MWMOTE) in classifying imbalanced data sets. Further, combining the multiple forecast results in the proposed model was found to be an effective way to solve the financial distress forecasting problem.

Keywords: Adaptive neighbor SMOTE, Chinese listed companies, Financial distress forecast, Random forest, Recursive ensemble approach

Review history: Received 23 January 2019, Revised 13 November 2019, Accepted 8 December 2019, Available online 16 December 2019, Version of Record 24 February 2020.

Official paper URL: https://doi.org/10.1016/j.knosys.2019.105365