Feature selection with dynamic mutual information

作者:

Highlights:

摘要

Feature selection plays an important role in data mining and pattern recognition, especially for large scale data. During past years, various metrics have been proposed to measure the relevance between different features. Since mutual information is nonlinear and can effectively represent the dependencies of features, it is one of widely used measurements in feature selection. Just owing to these, many promising feature selection algorithms based on mutual information with different parameters have been developed. In this paper, at first a general criterion function about mutual information in feature selector is introduced, which can bring most information measurements in previous algorithms together. In traditional selectors, mutual information is estimated on the whole sampling space. This, however, cannot exactly represent the relevance among features. To cope with this problem, the second purpose of this paper is to propose a new feature selection algorithm based on dynamic mutual information, which is only estimated on unlabeled instances. To verify the effectiveness of our method, several experiments are carried out on sixteen UCI datasets using four typical classifiers. The experimental results indicate that our algorithm achieved better results than other methods in most cases.

论文关键词:Classification,Feature selection,Mutual information,Filter method

论文评审过程:Received 16 December 2007, Revised 15 October 2008, Accepted 22 October 2008, Available online 8 November 2008.

论文官网地址:https://doi.org/10.1016/j.patcog.2008.10.028