A selective sampling approach to active feature selection
Authors:
Abstract
Feature selection, as a preprocessing step to machine learning, has been very effective in reducing dimensionality, removing irrelevant data, increasing learning accuracy, and improving result comprehensibility. Traditional feature selection methods resort to random sampling when dealing with data sets containing a huge number of instances. In this paper, we introduce the concept of active feature selection and investigate a selective sampling approach to active feature selection in a filter model setting. We present a formalism of selective sampling based on data variance and apply it to Relief, a widely used feature selection algorithm. Further, we show how it realizes active feature selection and reduces the required number of training instances, achieving time savings without performance deterioration. We design objective evaluation measures of performance, conduct extensive experiments using both synthetic and benchmark data sets, and observe consistent and significant improvement. We suggest some further work based on our study and experiments.
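To make the setting concrete, the following is a minimal sketch of the basic Relief weighting scheme that the paper builds on. The paper's contribution is to replace the uniform random sampling of instances with variance-based selective sampling; this sketch keeps plain random sampling for brevity, and the function name and data layout are illustrative assumptions, not the authors' code.

```python
import numpy as np

def relief(X, y, n_samples, seed=0):
    """Basic Relief feature weighting for a binary-class problem,
    with numeric features assumed scaled to [0, 1].

    NOTE: this uses uniform random sampling of instances; the paper
    studies replacing this step with variance-based selective
    sampling to cut the number of sampled instances."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for i in rng.choice(n, size=n_samples, replace=False):
        x, c = X[i], y[i]
        same = (y == c)
        same[i] = False                      # exclude the instance itself
        dists = np.abs(X - x).sum(axis=1)    # L1 distance to all instances
        hit = X[same][np.argmin(dists[same])]    # nearest same-class instance
        miss = X[~same][np.argmin(dists[~same])] # nearest other-class instance
        # Relevant features separate classes (large miss diff, small hit diff).
        w += (np.abs(x - miss) - np.abs(x - hit)) / n_samples
    return w  # larger weight => more relevant feature
```

On toy data where one feature tracks the class label and another is pure noise, the weight of the informative feature comes out clearly larger, which is the ranking behavior the paper's evaluation measures compare against.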
Keywords: Dimensionality reduction, Feature selection and ranking, Sampling, Learning
Article history: Received 3 June 2003, Accepted 13 May 2004, Available online 7 August 2004.
Paper link: https://doi.org/10.1016/j.artint.2004.05.009