Novel multi-label feature selection via label symmetric uncertainty correlation learning and feature redundancy evaluation

Authors:

Highlights:

Abstract

Multi-label data with high dimensionality, which are widespread in the real world, pose many challenges for applications in machine learning, pattern recognition, and other fields. Researchers have proposed multi-label feature selection methods from various perspectives. However, few studies address feature selection for multi-label data based on fuzzy mutual information, and most existing methods neglect the correlation between labels. In this study, we propose two novel multi-label feature selection approaches based on label symmetric uncertainty correlation and feature redundancy evaluation. First, we introduce the concept of symmetric uncertainty correlation between labels via fuzzy mutual information and design a label importance weight based on label symmetric uncertainty correlation learning. We then use the label importance weight to define a label similarity relation matrix on the multi-label space. Second, we define the symmetric uncertainty correlation between features and labels and propose the first multi-label feature selection approach. Third, because this first approach yields only a feature ranking and does not remove redundant features, we propose an improved redundancy-removing multi-label feature selection approach that introduces feature redundancy evaluation. Finally, comprehensive experiments are conducted to evaluate the performance of our methods. The results show that our methods outperform other representative feature selection methods.
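
The pipeline outlined in the abstract (label-to-label symmetric uncertainty, label weights, feature ranking, redundancy removal) can be illustrated with a rough sketch. This is not the authors' method. It swaps in classical Shannon mutual information on discrete data in place of the paper's fuzzy mutual information, and uses the standard definition SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)). The label weighting rule, the scoring rule, and the `threshold` parameter are assumptions made for illustration only, as are all function names.

```python
import math
from collections import Counter


def entropy(xs):
    """Shannon entropy (bits) of a discrete sequence."""
    n = len(xs)
    return -sum((c / n) * math.log2(c / n) for c in Counter(xs).values())


def mutual_information(xs, ys):
    """I(X; Y) = H(X) + H(Y) - H(X, Y) for discrete sequences."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def symmetric_uncertainty(xs, ys):
    """Standard SU: 2 * I(X; Y) / (H(X) + H(Y)), defined as 0 if both are constant."""
    hx, hy = entropy(xs), entropy(ys)
    if hx + hy == 0:
        return 0.0
    return 2.0 * mutual_information(xs, ys) / (hx + hy)


def label_weights(labels):
    """Assumed rule: weight each label by its total SU with the other labels.

    Falls back to uniform weights when no label correlates with any other.
    """
    q = len(labels)
    totals = [
        sum(symmetric_uncertainty(labels[i], labels[j]) for j in range(q) if j != i)
        for i in range(q)
    ]
    s = sum(totals)
    return [t / s for t in totals] if s > 0 else [1.0 / q] * q


def rank_features(features, labels):
    """Assumed score: label-weighted sum of SU between a feature and each label."""
    w = label_weights(labels)
    scores = [
        sum(wk * symmetric_uncertainty(f, lab) for wk, lab in zip(w, labels))
        for f in features
    ]
    order = sorted(range(len(features)), key=lambda i: -scores[i])
    return order, scores


def select_features(features, labels, threshold=0.9):
    """Walk the ranking and skip features too similar to an already selected one.

    `threshold` is a hypothetical redundancy cutoff on feature-to-feature SU.
    """
    order, _ = rank_features(features, labels)
    selected = []
    for i in order:
        if all(
            symmetric_uncertainty(features[i], features[j]) < threshold
            for j in selected
        ):
            selected.append(i)
    return selected
```

Here `features` and `labels` are lists of columns, each a sequence of discrete values over the same samples. Ranking alone corresponds loosely to the first approach in the abstract, and the redundancy filter in `select_features` to the second. Continuous features would need discretization first, or a fuzzy similarity relation as in the paper.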

Keywords: Multi-label feature selection, Symmetric uncertainty, Fuzzy mutual information, Feature redundancy evaluation

Review history: Received 28 December 2019, Revised 4 March 2020, Accepted 28 July 2020, Available online 20 August 2020, Version of Record 2 September 2020.

Official paper URL: https://doi.org/10.1016/j.knosys.2020.106342