Reduced gene subset selection based on discrimination power boosting for molecular classification

Authors:

Highlights:

Abstract

Traditional feature selection methods have two major design flaws in their selection criteria. First, they trade the benefit of relevant information off against the risk of redundant information. Second, they cannot escape the well-known trap that “the m best features are not the best m features”. There is no necessary inheritance between two consecutive selection rounds. To address the first problem, we propose a new selection criterion that concentrates on verifying the discrimination boosting effect. To address the second, we propose a novel feature selection scheme that can generate multiple subsets with varying feature combinations to support classification tasks. Our experimental results show that different subsets composed of different selected features can have such similar discrimination power that they may achieve similar classification quality. The results also verify that the proposed method can find simple reduced gene subsets for several genetic datasets both effectively and efficiently.
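The trap that “the m best features are not the best m features” can be illustrated with a small toy example. The sketch below is not the criterion or scheme proposed in the paper; it only uses plain information gain on a made-up dataset. The data, the feature indices, and the helper names `entropy` and `info_gain` are all assumptions introduced for illustration. Features 0 and 1 determine the label only jointly (label = XOR of the two), while features 2 and 3 are identical, noisy copies of the label.

```python
from collections import Counter
from itertools import combinations, product
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def info_gain(rows, labels, subset):
    """Information gain of the label given the joint values of `subset`."""
    groups = {}
    for row, y in zip(rows, labels):
        key = tuple(row[i] for i in subset)
        groups.setdefault(key, []).append(y)
    n = len(labels)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Hypothetical toy data: 16 samples, 4 binary features.
rows, labels = [], []
for f0, f1 in product((0, 1), repeat=2):
    for k in range(4):
        y = f0 ^ f1                   # label depends on f0 and f1 jointly
        f2 = y if k < 3 else 1 - y    # noisy copy of the label (75% agreement)
        f3 = f2                       # fully redundant duplicate of f2
        rows.append((f0, f1, f2, f3))
        labels.append(y)

# "The m best features": rank features one by one, keep the top two.
ranked = sorted(range(4), key=lambda i: info_gain(rows, labels, (i,)), reverse=True)
top2 = tuple(sorted(ranked[:2]))

# "The best m features": score every pair jointly, keep the best pair.
best2 = max(combinations(range(4), 2), key=lambda s: info_gain(rows, labels, s))

print("top-2 by individual gain:", top2, round(info_gain(rows, labels, top2), 3))
print("best pair by joint gain: ", best2, round(info_gain(rows, labels, best2), 3))
```

In this toy setting, individual ranking picks the two redundant features, whose joint gain is no higher than that of either one alone, whereas the pair that is individually uninformative separates the classes completely. Exhaustive pair scoring is used here only because the example is tiny; it does not scale to gene expression data with thousands of features.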

Keywords: Discrimination power, Feature selection, Molecular classification, Information gain, Cluster analyses

Review history: Received 3 July 2017, Revised 27 November 2017, Accepted 29 November 2017, Available online 2 December 2017, Version of Record 17 January 2018.

Official paper URL: https://doi.org/10.1016/j.knosys.2017.11.036