Mixed feature selection in incomplete decision table

Authors:

Abstract

Feature selection in incomplete decision tables has gained considerable attention recently. However, most feature selection methods are designed for incomplete data with categorical features only. In this paper, we introduce an extended rough set model that is based on a neighborhood-tolerance relation and applies to incomplete data with mixed categorical and numerical features. From this model we derive neighborhood-tolerance conditional entropy, an uncertainty measure that can be used to evaluate feature subsets. Dependency is a well-known feature evaluation measure in rough set theory. We compare the two measures and analyze them in terms of classification complexity; the analysis indicates that neighborhood-tolerance conditional entropy is a more effective feature evaluation criterion than dependency in incomplete decision tables. We then construct a heuristic feature selection algorithm based on neighborhood-tolerance conditional entropy. Experimental results show that the proposed method is applicable to, and effective on, incomplete mixed data.
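The abstract does not give the formal definitions, so the following is only an illustrative sketch of the general idea, not the paper's algorithm. It assumes that two samples are related when, on every selected feature, either value is missing, or the categorical values are equal, or the numerical values differ by at most a threshold `delta`. It also assumes one common form of conditional entropy, the average of `-log2(|N(x) ∩ D(x)| / |N(x)|)` over all samples, where `N(x)` is the set of samples related to `x` and `D(x)` is the set of samples sharing its class, together with a simple greedy forward search. The names `MISSING`, `related`, `conditional_entropy`, `select_features`, and `delta` are all hypothetical.

```python
import math

MISSING = None  # assumed marker for a missing value


def related(x, y, feats, numeric, delta):
    """True if x and y are indistinguishable on every feature in feats."""
    for f in feats:
        a, b = x[f], y[f]
        if a is MISSING or b is MISSING:
            continue  # tolerance: a missing value matches anything
        if numeric[f]:
            if abs(a - b) > delta:  # neighborhood: numeric values must be close
                return False
        elif a != b:  # categorical values must be equal
            return False
    return True


def conditional_entropy(X, y, feats, numeric, delta):
    """Assumed entropy form: mean of -log2(|N(x) & D(x)| / |N(x)|)."""
    n = len(X)
    total = 0.0
    for i in range(n):
        block = [j for j in range(n) if related(X[i], X[j], feats, numeric, delta)]
        same = sum(1 for j in block if y[j] == y[i])
        # block always contains i itself, so same >= 1 and the log is defined
        total -= math.log2(same / len(block))
    return total / n


def select_features(X, y, numeric, delta=0.1):
    """Greedy forward search: add the feature that lowers entropy the most."""
    remaining = list(range(len(numeric)))
    chosen, best = [], float("inf")
    while remaining:
        score, f = min(
            (conditional_entropy(X, y, chosen + [f], numeric, delta), f)
            for f in remaining
        )
        if score >= best:  # stop when no candidate improves the measure
            break
        chosen.append(f)
        remaining.remove(f)
        best = score
    return chosen


# Toy incomplete mixed data: feature 0 is categorical, feature 1 is numerical.
X = [("a", 0.10), ("a", 0.90), (MISSING, 0.15), ("b", 0.85)]
y = [0, 1, 0, 1]
numeric = [False, True]
selected = select_features(X, y, numeric, delta=0.1)
```

A lower entropy means the related-sample blocks are purer with respect to the class labels, which is why the search minimizes it. The stopping rule and the value of `delta` are design choices of this sketch and would need tuning on real data.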

Keywords: Mixed feature selection, Incomplete decision table, Neighborhood-tolerance relation, Conditional entropy, Dependency

Review history: Received 29 May 2013, Revised 18 December 2013, Accepted 18 December 2013, Available online 29 December 2013.

Official paper URL: https://doi.org/10.1016/j.knosys.2013.12.018