Online and offline streaming feature selection methods with bat algorithm for redundancy analysis
Abstract
Streaming feature selection (SFS) is the task of selecting the most informative features when dealing with high-dimensional or incrementally growing problems. Several SFS algorithms have been proposed in the literature. However, because of computational concerns, they do not consider all feature subsets at the redundancy analysis step. Moreover, they do not reconsider previously removed features, which leads to losing most of the useful information. In this paper, the redundancy analysis step is defined as a binary optimization problem, and a binary bat algorithm (BBA) is adopted to find the minimal informative subsets. In this way, a large number of feature subsets can be considered effectively at the redundancy analysis step. In addition, an effective priority list is used to maintain previously removed redundant features, which allows informative features to be re-examined. As a result, it is possible to consider the mutual information between features that are not streamed within a small time interval. Experimental studies on fifteen different types of datasets show that our approach is superior to state-of-the-art online and offline streaming feature selection methods in terms of classification accuracy.
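To make the idea of treating redundancy analysis as a binary optimization problem more concrete, the following is a minimal sketch of a binary bat search over feature masks. It is not the authors' implementation. The `fitness` function (relevance minus a pairwise redundancy penalty), the `relevance` and `redundancy` inputs, the constant loudness and pulse rate, and all parameter values are assumptions made for illustration. The priority list for previously removed features is not modeled here.

```python
import math
import random


def fitness(mask, relevance, redundancy, penalty=1.0):
    """Toy objective (an assumption, not the paper's criterion):
    reward per-feature relevance, penalize pairwise redundancy."""
    chosen = [i for i, bit in enumerate(mask) if bit]
    gain = sum(relevance[i] for i in chosen)
    overlap = sum(redundancy[i][j]
                  for a, i in enumerate(chosen) for j in chosen[a + 1:])
    return gain - penalty * overlap


def binary_bat_search(relevance, redundancy, n_bats=20, n_iter=50, seed=0):
    """Simplified binary bat search; returns (best_mask, best_fitness)."""
    rng = random.Random(seed)
    n = len(relevance)
    # Each bat holds a binary position (feature mask) and a real velocity.
    pos = [[rng.randint(0, 1) for _ in range(n)] for _ in range(n_bats)]
    vel = [[0.0] * n for _ in range(n_bats)]
    fit = [fitness(p, relevance, redundancy) for p in pos]
    best_idx = max(range(n_bats), key=fit.__getitem__)
    best, best_fit = pos[best_idx][:], fit[best_idx]
    loudness, pulse_rate = 0.9, 0.5  # held constant here for simplicity

    for _ in range(n_iter):
        for b in range(n_bats):
            freq = rng.uniform(0.0, 2.0)
            cand = pos[b][:]
            for d in range(n):
                vel[b][d] += (pos[b][d] - best[d]) * freq
                # V-shaped transfer function: velocity -> bit-flip probability.
                flip_prob = abs(2.0 / math.pi
                                * math.atan(math.pi / 2.0 * vel[b][d]))
                if rng.random() < flip_prob:
                    cand[d] = 1 - cand[d]
            if rng.random() > pulse_rate:
                # Local step: copy one randomly chosen bit from the best mask.
                d = rng.randrange(n)
                cand[d] = best[d]
            cand_fit = fitness(cand, relevance, redundancy)
            # Accept non-worsening candidates with probability `loudness`.
            if cand_fit >= fit[b] and rng.random() < loudness:
                pos[b], fit[b] = cand, cand_fit
            if cand_fit > best_fit:
                best, best_fit = cand[:], cand_fit
    return best, best_fit


if __name__ == "__main__":
    # Hypothetical scores: features 0 and 1 are strongly redundant.
    relevance = [0.9, 0.85, 0.1, 0.6]
    redundancy = [[0.0, 0.8, 0.0, 0.0],
                  [0.8, 0.0, 0.0, 0.0],
                  [0.0, 0.0, 0.0, 0.0],
                  [0.0, 0.0, 0.0, 0.0]]
    print(binary_bat_search(relevance, redundancy))
```

In a streaming setting, a search of this kind would presumably be rerun on the current candidate set whenever new features arrive, with relevance and redundancy estimated from data (for example, via mutual information) rather than supplied by hand as above.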
Keywords: Feature selection, Online feature selection, Streamwise feature selection, Dimension reduction, Bat algorithm
Review history: Received 7 December 2021, Revised 20 May 2022, Accepted 27 August 2022, Available online 29 August 2022, Version of Record 12 September 2022.
Official paper URL: https://doi.org/10.1016/j.patcog.2022.109007