Invariant optimal feature selection: A distance discriminant and feature ranking based solution

Authors:

Highlights:

Abstract

The goal of feature selection is to find the optimal subset of m features chosen from the total n features. One critical problem for many feature selection methods is that an exhaustive search strategy has to be applied to seek the best subset among all C(n, m) possible feature subsets, which usually results in considerably high computational complexity. The alternative suboptimal feature selection methods provide more practical solutions in terms of computational complexity, but they cannot guarantee that the finally selected feature subset is globally optimal. We propose a new feature selection algorithm based on a distance discriminant (FSDD), which not only solves the problem of high computational cost but also overcomes the drawbacks of the suboptimal methods. The proposed method is able to find the optimal feature subset without exhaustive search or the Branch and Bound algorithm. The most difficult problem in optimal feature selection, the search problem, is converted into a feature ranking problem following rigorous theoretical proof, so that the computational complexity can be greatly reduced. The proposed method is invariant to linear transformations of the data when a diagonal transformation matrix is applied. FSDD was compared with ReliefF and the mutual-information-based mrmrMID on 8 data sets. The experimental results show that FSDD outperforms the other two methods and is highly efficient.
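The key idea above, replacing a combinatorial subset search with an independent per-feature ranking, can be sketched as follows. The abstract does not give the exact form of the distance-discriminant criterion, so the score below (between-class scatter minus within-class scatter, computed per feature) is only an illustrative stand-in, not the paper's actual formula:

```python
import numpy as np

def rank_features(X, y, m):
    """Return indices of the top-m features by a per-feature
    class-separation score.

    NOTE: the score (between-class scatter minus within-class
    scatter, per feature) is an illustrative placeholder for the
    paper's distance-discriminant criterion, which the abstract
    does not specify.
    """
    classes = np.unique(y)
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for c in classes:
        Xc = X[y == c]
        p = len(Xc) / len(X)          # class prior
        between += p * (Xc.mean(axis=0) - overall_mean) ** 2
        within += p * Xc.var(axis=0)
    scores = between - within
    # Each feature is scored independently, so selecting the top-m
    # scores replaces an exhaustive search over C(n, m) subsets with
    # a single O(n log n) sort.
    return np.argsort(scores)[::-1][:m]
```

Because the scores are computed once per feature, the cost is linear in n (plus a sort), which is the complexity reduction the abstract claims for converting subset search into feature ranking.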

Keywords: Optimal feature selection, Distance discriminant, Feature ranking

Article history: Received 4 February 2007, Revised 28 August 2007, Accepted 10 October 2007, Available online 22 October 2007.

DOI: https://doi.org/10.1016/j.patcog.2007.10.018