Feature selection for multi-label classification by maximizing full-dimensional conditional mutual information

Authors: Zhi-Chao Sha, Zhang-Meng Liu, Chen Ma, Jun Chen

Abstract

Conditional mutual information (CMI) maximization is a promising criterion for selecting features in a computationally efficient stepwise way, but it is difficult to apply in full because probability estimates are imprecise and the computational load is heavy. Many dimension-reduced CMI-based and mutual information (MI)-based methods have been reported to achieve state-of-the-art classification performance. However, the dimension reduction in these methods introduces model deviations into the CMI and MI formulations. In this paper, we start from the full-dimensional CMI to address the feature selection problem, so as to retain the full inter-feature and feature-label mutual information when selecting new features. The cost function is approximated and simplified mathematically to overcome the difficulties of maximizing the original full-dimensional CMI. A relationship is established between the proposed feature selection criterion and the one based on the Hilbert-Schmidt independence criterion, which explains qualitatively how the new criterion achieves relevance maximization and redundance minimization simultaneously. Experiments on real-world datasets demonstrate the superiority of the proposed method over existing ones.
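To make the stepwise criterion concrete, the sketch below greedily adds, at each step, the candidate feature f that maximizes I(f; Y | S), where S is the set of already selected features and Y is the label vector. It is a minimal illustration and not the authors' approximated cost function: it assumes discrete features and uses naive plug-in (count-based) probability estimates. The names `entropy`, `cmi`, `greedy_cmi_select`, and the toy data are hypothetical.

```python
from collections import Counter
from math import log


def entropy(outcomes):
    """Plug-in Shannon entropy (in nats) of a list of hashable outcomes."""
    n = len(outcomes)
    return -sum(c / n * log(c / n) for c in Counter(outcomes).values())


def cmi(x, y, z):
    """Empirical I(X; Y | Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)."""
    return (entropy(list(zip(x, z))) + entropy(list(zip(y, z)))
            - entropy(list(zip(x, y, z))) - entropy(list(z)))


def greedy_cmi_select(features, labels, k):
    """Pick k features, each maximizing CMI with the labels given those already chosen.

    features: dict mapping feature name -> list of discrete values (assumed layout)
    labels:   list of label tuples, one tuple per sample (multi-label)
    """
    selected = []
    joint = [()] * len(labels)  # joint state of the selected features, per sample
    for _ in range(k):
        candidates = [f for f in features if f not in selected]
        best = max(candidates, key=lambda f: cmi(features[f], labels, joint))
        selected.append(best)
        # Condition on the full joint state: no dimension reduction is applied.
        joint = [s + (v,) for s, v in zip(joint, features[best])]
    return selected


# Toy data: two labels that equal features "a" and "b"; "a_copy" duplicates "a".
features = {
    "a":      [0, 0, 1, 1, 0, 0, 1, 1],
    "a_copy": [0, 0, 1, 1, 0, 0, 1, 1],
    "b":      [0, 1, 0, 1, 0, 1, 0, 1],
}
labels = list(zip(features["a"], features["b"]))
selected = greedy_cmi_select(features, labels, k=2)
print(selected)
```

On this toy data, once one of the duplicated features is chosen, its copy carries no further conditional information about the labels, so the second pick is the non-redundant feature. Because the joint state grows with every selected feature, the count-based estimates become unreliable quickly on real data, which is the imprecise-probability and computational-load difficulty the abstract refers to.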

Keywords: Feature selection, Classification, Conditional mutual information, Relevance maximization, Redundance minimization, Hilbert-Schmidt independence criterion (HSIC)

Paper URL: https://doi.org/10.1007/s10489-020-01822-0