R2CI: Information theoretic-guided feature selection with multiple correlations

Authors:

Highlights:

• A number of information-theoretic feature selection approaches (ITFSs) are reviewed in terms of feature correlation. For comparison, the correlations each approach focuses on and its objective evaluation function are listed.

• This paper is the first work to comprehensively investigate and define multiple correlations in the feature selection process, including relevance, redundancy, complementarity, and interaction, within the framework of information theory. The distinctions and connections between these correlations are explored in depth at four different levels.

• A new feature selection approach that combines multiple correlations of features is constructed, and a corresponding feature selection algorithm with class-based relevance, redundancy, complementarity, and interaction (R2CI) is designed.

• The results of comparisons and hypothesis testing against eleven related feature selection algorithms on twenty datasets show that the proposed algorithm has significant advantages in most cases.
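The highlights name four kinds of correlation but give no formulas. The following is a minimal sketch of how such quantities are commonly estimated for discrete features, assuming the standard textbook definitions: relevance as I(f; C), redundancy as I(f_i; f_j), complementarity as the conditional mutual information I(f_i; C | f_j), and interaction as I(f_i; f_j | C) - I(f_i; f_j). These definitions and all function names are assumptions made for illustration; they are not taken from the paper and may differ from the R2CI evaluation function.

```python
from collections import Counter
from math import log2


def entropy(*columns):
    """Joint Shannon entropy (in bits) of one or more discrete columns."""
    joint = list(zip(*columns))
    n = len(joint)
    return -sum((c / n) * log2(c / n) for c in Counter(joint).values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y). Used here for relevance and redundancy."""
    return entropy(x) + entropy(y) - entropy(x, y)


def conditional_mutual_information(x, y, z):
    """I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)."""
    return entropy(x, z) + entropy(y, z) - entropy(x, y, z) - entropy(z)


def interaction_gain(x, y, c):
    """I(X;Y|C) - I(X;Y): an assumed interaction measure, not the paper's."""
    return conditional_mutual_information(x, y, c) - mutual_information(x, y)


# Toy XOR data: neither feature is relevant alone, but together they fix the class.
f1 = [0, 0, 1, 1]
f2 = [0, 1, 0, 1]
label = [a ^ b for a, b in zip(f1, f2)]

relevance = mutual_information(f1, label)                          # I(f1; C)
redundancy = mutual_information(f1, f2)                            # I(f1; f2)
complementarity = conditional_mutual_information(f1, label, f2)    # I(f1; C | f2)
interaction = interaction_gain(f1, f2, label)
```

On this XOR example, relevance and redundancy are both 0 bits, while the conditional term and the interaction term are each 1 bit. This is the kind of case that a criterion based only on relevance and redundancy would miss.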


Keywords: Feature selection, Information theory, Relevance, Redundancy, Complementarity, Interaction

Review history: Received 2 November 2021, Revised 6 February 2022, Accepted 21 February 2022, Available online 23 February 2022, Version of Record 2 March 2022.

Official paper URL: https://doi.org/10.1016/j.patcog.2022.108603