Supervised feature selection by clustering using conditional mutual information-based distances

Authors:

Highlights:

Abstract

This paper presents a supervised feature selection approach based on a metric applied to continuous and discrete data representations. The method builds a dissimilarity space using information-theoretic measures, in particular the conditional mutual information between features with respect to a relevant variable that represents the class labels. By applying hierarchical clustering in this space, the algorithm searches for a compression of the information contained in the original set of features. The proposed technique is compared with other state-of-the-art methods that are also based on information measures. Finally, several experiments show the effectiveness of the selected features in terms of classification accuracy.
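The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: the symmetric dissimilarity I(Xi;Y|Xj) + I(Xj;Y|Xi), average linkage, the rule of keeping the feature with the highest I(X;Y) from each cluster, and all function names. It handles discrete features only, with probabilities estimated from raw counts.

```python
from collections import Counter
from math import log2


def entropy(*columns):
    """Joint Shannon entropy (bits) of one or more discrete columns."""
    n = len(columns[0])
    counts = Counter(zip(*columns))
    return -sum(c / n * log2(c / n) for c in counts.values())


def cond_mutual_info(x, y, z):
    """I(X; Y | Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)."""
    return entropy(x, z) + entropy(y, z) - entropy(x, y, z) - entropy(z)


def cmi_distance(xi, xj, y):
    """Assumed dissimilarity (not taken from the paper): I(Xi;Y|Xj) + I(Xj;Y|Xi)."""
    return cond_mutual_info(xi, y, xj) + cond_mutual_info(xj, y, xi)


def cluster_features(features, y, n_clusters):
    """Average-linkage agglomerative clustering of named feature columns."""
    names = list(features)
    dist = {
        frozenset((a, b)): cmi_distance(features[a], features[b], y)
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }

    def linkage(c1, c2):
        # Mean pairwise dissimilarity between the members of two clusters.
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    clusters = [[name] for name in names]
    while len(clusters) > n_clusters:
        # Merge the two closest clusters; ties resolve to the lowest indices.
        _, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def select_representatives(features, y, clusters):
    """Assumed rule: keep, per cluster, the feature with the highest I(X;Y)."""
    def relevance(name):
        x = features[name]
        return entropy(x) + entropy(y) - entropy(x, y)

    return [max(cluster, key=relevance) for cluster in clusters]


# Toy data: "a" and "b" duplicate the class labels, "c" is unrelated to them.
labels = [0, 0, 1, 1, 0, 0, 1, 1]
toy_features = {
    "a": [0, 0, 1, 1, 0, 0, 1, 1],
    "b": [0, 0, 1, 1, 0, 0, 1, 1],
    "c": [0, 1, 0, 1, 0, 1, 0, 1],
}
toy_clusters = cluster_features(toy_features, labels, n_clusters=2)
toy_selected = select_representatives(toy_features, labels, toy_clusters)
print(toy_clusters, toy_selected)
```

In this toy example the two redundant features end up in the same cluster because neither adds class information once the other is known, so only one of them is kept. Continuous features would need a density or discretization step before the entropies can be estimated, which this sketch leaves out.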

Keywords: Supervised feature selection, Clustering, Conditional mutual information

Article history: Received 19 June 2009, Revised 5 December 2009, Accepted 17 December 2009, Available online 23 December 2009.

Official paper URL: https://doi.org/10.1016/j.patcog.2009.12.013