Supervised feature subset selection with ordinal optimization
Authors:
Abstract
The ultimate goal of supervised feature selection is to select a feature subset that maximizes prediction accuracy. Most existing methods, such as filter and embedded models, can be viewed as optimizing approximate objectives that differ from prediction accuracy. Wrapper models maximize prediction accuracy directly, but the optimization has very high computational complexity. To address these limitations, we present an ordinal optimization perspective on feature selection (OOFS). Feature subset evaluation is formulated as a system simulation process with randomness. Supervised feature selection then becomes maximizing the expected performance of the system, so ordinal optimization can be applied to identify a set of order-good-enough solutions with much lower complexity and with parallel computing. These solutions correspond to the truly good enough (value-good-enough) solutions when the structure of the solution space, characterized by the ordered performance curve (OPC), exhibits a concave shape. Our analysis indicates that this happens in some important applications such as image classification, where a large number of features have relatively similar discriminative ability. We further improve the OOFS method with a feature scoring algorithm, called OOFSs. We prove that, when the performance difference between solutions increases monotonically with the solution difference, the expectation of the scores carries useful information for estimating the globally optimal solution. Experimental results on sixteen real-world datasets show that our method provides a good trade-off between prediction accuracy and computational complexity.
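The sampling idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the synthetic data, the nearest-centroid evaluator, the function names (`make_data`, `crude_accuracy`, `oofs_sketch`), and all parameter values are assumptions chosen for illustration. In particular, the per-feature score used here (mean crude accuracy of the sampled subsets that contain the feature) is a simple stand-in, since the abstract does not specify the OOFSs scoring rule.

```python
import random


def make_data(n_samples=200, n_features=12, n_informative=4, seed=0):
    """Synthetic two-class data (assumption): only the first
    n_informative features carry class signal."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_samples):
        label = rng.randint(0, 1)
        row = [rng.gauss(1.5 * label if j < n_informative else 0.0, 1.0)
               for j in range(n_features)]
        X.append(row)
        y.append(label)
    return X, y


def crude_accuracy(X, y, subset, rng, n_train=40, n_test=40):
    """Cheap, noisy evaluation of one feature subset: nearest-centroid
    accuracy on a small random train/test split."""
    idx = rng.sample(range(len(X)), n_train + n_test)
    train, test = idx[:n_train], idx[n_train:]
    centroids = {}
    for c in (0, 1):
        rows = [X[i] for i in train if y[i] == c]
        if not rows:  # degenerate split: fall back to chance level
            return 0.5
        centroids[c] = [sum(r[j] for r in rows) / len(rows) for j in subset]
    correct = 0
    for i in test:
        dist = {c: sum((X[i][j] - m) ** 2
                       for j, m in zip(subset, centroids[c]))
                for c in (0, 1)}
        if min(dist, key=dist.get) == y[i]:
            correct += 1
    return correct / n_test


def oofs_sketch(X, y, subset_size=4, n_samples=300, top_s=10, seed=1):
    """Uniformly sample subsets, rank them by crude accuracy, and keep
    the top_s as the order-good-enough candidates."""
    rng = random.Random(seed)
    n_features = len(X[0])
    sampled = []
    for _ in range(n_samples):
        subset = tuple(sorted(rng.sample(range(n_features), subset_size)))
        sampled.append((crude_accuracy(X, y, subset, rng), subset))
    # Sorting by observed performance gives an empirical ordered
    # performance curve over the sampled subsets.
    sampled.sort(key=lambda pair: pair[0], reverse=True)
    selected = sampled[:top_s]
    # Illustrative feature score (assumption, not the paper's rule):
    # mean crude accuracy over sampled subsets containing the feature.
    scores = []
    for j in range(n_features):
        accs = [a for a, s in sampled if j in s]
        scores.append(sum(accs) / len(accs) if accs else 0.0)
    return selected, scores


if __name__ == "__main__":
    X, y = make_data()
    selected, scores = oofs_sketch(X, y)
    print("best sampled subset:", selected[0])
    print("feature scores:", [round(v, 3) for v in scores])
```

Each call to `crude_accuracy` is independent of the others, which is what makes this kind of evaluation easy to run in parallel; the sketch keeps it sequential for brevity.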
Keywords: Supervised feature selection, Uniform sampling, Ordinal optimization, Ordered performance curve, Feature scoring, Order-good-enough, Value-good-enough
Review history: Received 12 January 2013, Revised 5 November 2013, Accepted 7 November 2013, Available online 28 November 2013.
Official paper URL: https://doi.org/10.1016/j.knosys.2013.11.004