Feature selection with Symmetrical Complementary Coefficient for quantifying feature interactions

Authors: Rui Zhang, Zuoquan Zhang

Abstract

In machine learning and data mining, feature interaction is a ubiquitous issue that cannot be ignored, and it has attracted increasing attention in recent years. In this paper, we propose the Symmetrical Complementary Coefficient, a measure that quantifies feature interactions. Based on it, we improve the Sequential Forward Selection (SFS) algorithm and propose a new feature subset search algorithm, SCom-SFS, which only needs to consider the interactions between adjacent features in a given sequence rather than between all pairs of features. Moreover, the discovered feature interactions speed up the search for the optimal feature subset. We also improve the ReliefF algorithm by screening out representative samples from the original data set, so that sampling instances is no longer necessary. The improved ReliefF algorithm is shown to be more efficient and reliable. Combining the two modified algorithms yields RRSS, an effective and complete feature selection algorithm. In our experiments, RRSS outperformed five classic and two recent feature selection algorithms in terms of the size of the resulting feature subset, Accuracy, Kappa coefficient, and adjusted Mean-Square Error (MSE).
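For readers unfamiliar with the baseline that SCom-SFS modifies, the following is a minimal sketch of plain Sequential Forward Selection. It is not the authors' SCom-SFS and does not compute the Symmetrical Complementary Coefficient, whose definition is not given in the abstract. The function name, the `min_gain` stopping threshold, and the toy scoring function are all assumptions made for illustration; in practice the score would typically be a cross-validated accuracy of a classifier trained on the candidate subset.

```python
from typing import Callable, List, Sequence


def sequential_forward_selection(
    n_features: int,
    score: Callable[[Sequence[int]], float],
    min_gain: float = 1e-9,  # hypothetical stopping threshold
) -> List[int]:
    """Greedy SFS: repeatedly add the single feature that most improves the score."""
    selected: List[int] = []
    best_score = score(selected)
    remaining = list(range(n_features))
    while remaining:
        # Score every candidate subset formed by adding one remaining feature.
        candidates = [(score(selected + [f]), f) for f in remaining]
        # Pick the highest score; break ties in favour of the lower feature index.
        top_score, top_feature = max(candidates, key=lambda t: (t[0], -t[1]))
        if top_score - best_score <= min_gain:
            break  # no candidate improves the subset enough
        selected.append(top_feature)
        remaining.remove(top_feature)
        best_score = top_score
    return selected


def toy_score(subset: Sequence[int]) -> float:
    """Hypothetical score: features 0 and 2 are informative, each feature costs 0.01."""
    weights = {0: 0.5, 2: 0.3}
    return sum(weights.get(f, 0.0) for f in subset) - 0.01 * len(subset)


result = sequential_forward_selection(4, toy_score)
print(result)
```

Plain SFS evaluates each feature only against the current subset, so a feature that is useful only in combination with another may never be selected. This is the kind of limitation that quantifying feature interactions, as the abstract describes, is meant to address.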

Keywords: Feature selection, ReliefF, Sequential Forward Selection, Feature interaction, Random Forest, Symmetrical Complementary Coefficient

Review process:

Official paper URL: https://doi.org/10.1007/s10489-019-01518-0