Utilizing the advantages of both global and local search strategies for finding a small subset of features in a two-stage method
Authors: Mohammad Masoud Javidi, Fatemeh Zarisfi Kermani
Abstract
Feature selection (FS) is a pre-processing method widely used in data mining and pattern recognition. Its main goals are to eliminate redundant or irrelevant features from large data sets and to find a suitable feature subset. Evolutionary algorithms, including global search algorithms such as the Genetic Algorithm and local search algorithms such as hill climbing, are regarded as among the best ways to solve a variety of optimization problems, including the FS problem. However, they often fail to find a globally optimal solution because they become trapped in a local optimum and stop. Researchers have therefore tried to address this major problem by finding ways to escape from local optima. In this article, we propose a two-stage method that combines a global search algorithm and a local search algorithm to find a sub-optimal solution to the FS problem. Here, we define a sub-optimal solution as one with a high reduction rate and similar or even better classification performance. In the proposed two-stage method, referred to as BGSA-SA, the binary version of the Gravitational Search Algorithm (BGSA) and Simulated Annealing (SA) are selected as the global and local search algorithms, respectively. To evaluate the proposed method, we used several UCI machine learning datasets and two classifiers, SVM and K-NN. We compare its accuracy and reduction rate with three groups of methods: (1) six single meta-heuristic methods, namely BGA, BPSO, GSAPSO, CHGSA, BGSA, and SA; (2) two other two-stage methods, namely BGA-SA and BPSO-SA; and (3) seven published state-of-the-art methods. The results confirm that BGSA-SA ranks first in reduction rate, while its accuracy with both the SVM and K-NN classifiers is similar to or, in some cases, better than that of the other methods.
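As a rough illustration of the two-stage idea, the sketch below runs a global stage and then refines its result with simulated annealing over a binary feature mask. It is not the authors' implementation: the abstract does not spell out the stage ordering or the objective, so running the global stage first, the `toy_fitness` function, the `alpha` weight, the reduction-rate definition (1 minus selected/total), and all parameter values are assumptions. The global stage here is plain random sampling standing in for BGSA, and no real classifier is trained.

```python
import math
import random


def reduction_rate(mask):
    """Fraction of features dropped (assumed definition: 1 - selected/total)."""
    return 1.0 - sum(mask) / len(mask)


def toy_fitness(mask, relevant=frozenset({0, 3, 7}), alpha=0.9):
    """Hypothetical objective: a stand-in for classifier accuracy, blended
    with the reduction rate. A real setup would train SVM or K-NN instead."""
    if not any(mask):
        return 0.0  # an empty feature subset is treated as useless
    hits = sum(1 for i in relevant if mask[i])
    pseudo_accuracy = hits / len(relevant)
    return alpha * pseudo_accuracy + (1 - alpha) * reduction_rate(mask)


def global_stage(n_features, fitness, rng, n_samples=50):
    """Stage 1 placeholder: random sampling of masks. This is NOT BGSA;
    it only supplies a starting point for the local stage."""
    best, best_fit = None, -1.0
    for _ in range(n_samples):
        cand = [rng.randint(0, 1) for _ in range(n_features)]
        fit = fitness(cand)
        if fit > best_fit:
            best, best_fit = cand, fit
    return best, best_fit


def simulated_annealing(start, fitness, rng, t0=1.0, cooling=0.95, steps=500):
    """Stage 2: single-bit-flip simulated annealing that maximizes fitness."""
    current, cur_fit = list(start), fitness(start)
    best, best_fit = list(current), cur_fit
    temp = t0
    for _ in range(steps):
        neighbor = list(current)
        idx = rng.randrange(len(neighbor))
        neighbor[idx] = 1 - neighbor[idx]  # flip one feature in or out
        n_fit = fitness(neighbor)
        delta = n_fit - cur_fit
        # Always accept improvements; accept worse moves with prob. exp(delta/T).
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, cur_fit = neighbor, n_fit
            if cur_fit > best_fit:
                best, best_fit = list(current), cur_fit
        temp = max(temp * cooling, 1e-9)  # geometric cooling with a floor
    return best, best_fit


if __name__ == "__main__":
    rng = random.Random(0)
    start, start_fit = global_stage(12, toy_fitness, rng)
    refined, refined_fit = simulated_annealing(start, toy_fitness, rng)
    print("after global stage:", start, round(start_fit, 3))
    print("after SA refinement:", refined, round(refined_fit, 3))
```

Because the annealing stage tracks the best mask it has seen, its result is never worse than the starting point under the chosen objective; whether it escapes a local optimum depends on the temperature schedule, which here is only a placeholder.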
Keywords: Data mining, Feature selection, Gravitational search algorithm, Simulated annealing
Paper URL: https://doi.org/10.1007/s10489-018-1159-5